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0.  Executive  Summary 

This  Phase  I  SBIR  research  effort  concentrated  on  the  feasibility  of 
creating  the  primary  tools  for  the  prototype  development  in  Phase  II  of  a 
digital  change  detection  vrorkstation.  This  system  is  intended  to  to  be 
capable  of  detecting  long-term  (6  months  to  one  year)  and/or  seasonal 
changes  from  all-source  imagery.  The  system  is  intended  to  be  hosted  on  a 
SUN-4  platform  operating  under  a  UNIX/C  software  environment. 

The  emphasis  of  the  present  effort  was  on  the  two  major  technical 
challenges  for  the  development  of  such  a  system:  precision  image 
registration  and  robust  change  detection  and  analysis.  At  the  direction 
of  the  BTL  custcmier,  most  of  this  effort  was  directed  toward  automated 
SAR-optical  image  registration  and  automated  change  cueing  experiments. 
Change  cueing  is  an  initial  step  in  change  detection  for  identifying 
regions  where  possible  change  events  may  have  occured. 

The  automated  registration  effort  was  successful  over  the  data  sets 
tested.  These  data  sets  did  not  contain  appreciable  terrain-induced 
distortions.  Additional  testing  and  refinement  of  the  present  algorithms 
is  recommended  for  Phase  ll. 

Some  theoretical  developments  for  algorithms  are  made  for  addressing  such 
terrain-induced  conplications  in  Phase  II.  The  implementation,  testing  , 
and  modifications  of  such  algorithms  will  also  be  a  Phase  II  priority. 

However,  a  useful  system  must  always  allow  for  interactive  intervention  by 
a  human  operator.  This  is  because  automated  techniques  are  not  guaranteed 
to  perform  perfectly  100%  of  the  time.  Hierefore,  recommendations  are 
also  made  for  an  interactive  system  to  supplement  the  automated 
registration  techniques. 

The  second  main  effort  concerned  change  cueing.  This  efffort  was  also 
successful  over  the  data  set  which  was  attempted.  However,  more  testing 
must  be  performed  in  Phase  II  over  additional  imagery  sets. 

Change  analysis  is  the  subsequent  step  v4iich  evaluates  such  cued  regions 
as  being  legitimate  changes  or  not.  Change  analysis  was  not  pursued 
during  this  Phase  I  effort,  and  will  only  be  pursued  to  a  limited  degree 
in  Phase  ll.  This  is  because  the  difficulty  of  this  problem  lends  itself 
better  to  an  interactive  approach  with  a  human  operator.  However,  the 
operator  is  expected  to  be  selectively  cued  to  only  a  small  number  of 
possible  change  events. 

An  initial  effort  was  also  included  for  describing  some  of  the  image 
variabilities  that  occur  as  the  the  sensing  scenarios  change. 

Following  a  summary  in  section  7.1  of  results  obtained  so  far,  an  initial 
sxmanary  of  Phase  II  requirements  appears  in  section  7.2. 
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Introduction  to  a>ase  I  Report 

The  Phase  I  Technical  Objectives  are  summarized  in  section  1.1/  an 
overview  of  the  conclusions  appears  in  1.2,  and  the  layout  of  this  report 
is  described  in  section  1.3. 

1.1  Background  and  (&jectives  of  Phase  I  Research 

The  two  technical  objectives  concerned  the  technically  demanding  tasks  of 
precision  registration  and  change  detection  emd  analysis. 

Teobn'ical  Objective  fli  Demonstrate  the  feasibility  of  precision 

registration  of  multi-source  imagery,  both  image-image  as  well 
as  image-map. 

Technical  Objective  >2:  Demonstrate  the  feasibility  of  robust  cheunge 
detection  and  analysis  for  multi-source  imagery. 

The  Phase  I  effort  therefore  concentrated  on  registration  experiments  and 
the  feasibility  of  initial  cueing  for  change  detection.  The  registration 
effort  examined  both  rough  registration  as  well  as  sub-pixel  estimation 
for  SAR-optical  pairs  of  imagery. 

Change  detection  involves  cueing  potential  changes  and  evaluating  these 
cued  regions  as  legitimate  changes  or  as  image  artifacts  due  to  other 
causes,  ihe  Phase  I  research  only  dealt  with  cueing  potential  change 
events.  The  change  events  examined  were  actual  chsmges  over  time  and 
synthetic  changes  inserted  into  imagery. 

The  examination  of  these  issues  was  crucial  for  achieving  the  ultimate 
Phase  II  goal  of  creating  a  workstation  for  autranatic  change  detection  of 
all-source  imagery. 

1.2  Sunnnary  of  Conclusions 

The  image-image  registration  efforts  were  encouraging.  The  two  types  of 
rough  registration  methods  were  area-based  and  contour-based.  These 
algorithms  operated  under  the  assxanption  of  little  terrain  distortion  to 
the  imagery.  A  contour-based  algorithm  for  dealing  with  the  problem  of 
registering  SAR  image  pairs  containing  considerable  terrain  relief  is 
outlined.  Fuller  development  and  testing  of  this  procedure  is  a  Phase  II 
issue. 

A  two-stage  method  of  combining  the  area-based  and  contour-based 
algorithms  showed  some  improvement  over  the  performance  of  each 
separately.  In  particular,  small  residual  amounts  of  rotation  on  the 
order  of  a  degree  were  removed  by  this  procedure. 

Two  sub-pixel  registration  estimation  algorithms  are  also  presented,  with 
extensive  testing  results  so  far  available  for  one  of  them.  The  further 
investigation  of  the  other  algorithm  will  be  undertaken  in  Phase  II. 
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The  following  statistics  summarize  registration  accuracies  achieved  in 
testing: 

o  area-based 

-  K-L  algorithm  (see  section  3.2. 1.1):  +  1  pixel  in  50%  of  cases, 

-  MNF  algorithm  (see  section  3.2. 1.2):  +  1  pixel  in  75%  of  cases, 

o  contour-based:  <  2  pixels  in  all  cases, 
o  combined  area-contour:  +  1  pixel  (using  MNF  area  method) 
o  sub-pixel  registration: 

-  Algorithm  #1  (see  section  4.1.1):  ~  5x10“^  pixel 

(using  lower  frequencies,  and  without 
'  ‘  .  spectral  leakage  filtering) 

-  Algorithm  #2  (see  section  4.1.2):  '<  10"^.  pixel  (best  results 
using  Heunning  filter  for  filtering  spectral  leeJcage) 

The  sub-pixel  results  were  obtained  on  simulated  data. 

The  other  main  technical  area  investigated  was  change  cueing  on  the  pixel 
level,  given  a  registered  image  pair.  Both  actual  ch2tfiges  and  simulated 
changes  were  examined.  The  results  of  testing  revealed  excellent  cueing 
even  for  small  (two  pixels)  targets  as  long  as  the  local  image-image 
correlation  of  the  background  was  high,  with  the  performance  degrading  as 
this  correlation  decreases. 

From  a  signal  processing  standpoint,  this  performance  dependence  on  the 
background  correlation  levels  is  unavoidable  for  pixel  level  procesing. 
However,  considerably  more  progress  can  be  achieved  using  processing 
methods  v4iich  effectively  increase  these  background  corelation  levels  by 
restricting  attention  to  selective  frequency  regions. 

Progress  beyond  what  can  be  achieved  using  such  enhanced  pixel-level 
processing  would  probably  require  higher-level  procedures  on  the  object- 
level.  Such  methods  will  be  required  to  make  hypotheses  on  the  existence 
of  objects  based  on  pattern  analysis,  as  opposed  to  sinply  thresholding 
based  on  local  statistics. 

Despite  the  encouraging  results  using  the  automated  methods,  it  is 
strongly  recomnended  that  the  Phase  II  workstation  contain  capabilities 
for  interactive  as  well  as  automated  modes  of  operation.  Such  a  dual 
capability  allows  the  use  of  a  human  operator  to  examine  and  assess  the 
results  generated  by  automated  registration  procedures,  make  corrections 
if  needed,  and  to  provide  initial  offsets  for  difficult  or  ambiguous 
cases. 

For  change  detection,  the  use  of  automated  change  cueing  requires  only 
selective  attention  by  the  operator,  but  uses  his/her  superior  judgement 
for  evaluating  cues  as  legitimate  targets. 
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1.3  Orqamization  of  the  Report 


The  pre-processing  methods  appear  in  section  2,  initial  rough  and  sub¬ 
pixel  registration  methods  are  in  sections  3  emd  4,  and  change  cueing  is 
discussed  in  section  5.  Some  observations  on  the  variability  of  these 
procedures  for  changing  imaging  scenarios  is  presented  in  section  6. 

Finally,  some  conclusions  on  the  practicality  of  these  methods  are 
discussed  in  section  7. 
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2.  Pre-Processing 


2.1  Characterization  of  imagery  Set 

Any  study  of  registration  and  change  detection  methods  for  multi-source, 
multi-spectral  imagery  first  re<iuires  compiling  a  representative  set  of 
imagery  containing  the  relevant  objects  of  interest.  More  precisely  for 
change  detection,  temporal  data  sets  depicting  the  same  region  at 
different  time  periods  are  vAiat  is  needed.  Additionally,  multi-source  or 
multi-spectral  temporal  data  sets  provide  opportunities  for  registration 
and  change  detection  experiments  between  imagery  types. 

During  the  present  Phase  I  research,  the  effort  concentrated  on  matching 
auid  change  detection  between  synthetic  aperture  radar  (SAR)  and  electro- 
optical  (EO)  Landsat  TM  imagery. 

One  multi-source  data  set  consists  of  the  JPL  quadpole  aircraft  SAR  of  the 
Raisin  City  CA  site  and  collateral  coverage  from  7  bands  of  Landsat  TO. 
The  SAR  and  TO  data  represent  imaging  times.  Also,  SIR-B  imagery  of  the 
same  site  was  available. 

A  second  data  set  consists  of  SEASAT  SAR  and  7  bands  of  TO  data  of  the 
Yuma  AZ  site.  Again,  the  SAR  and  TO  data  were  imaged  at  different  times. 

The  Landsat  TO  data  was  resampled  to  match  the  resolution  of  the  SEASAT 
data  for  registration  experiments. 


Table  2-1  Imagery  set. 


Sensor 

Region 

Resolution 
(range  x  azimuth) 

Wavelength 

Polarization 

Flying  Altitude 

JPL  quad-pole 
SAR 

Raisin  City,  CA 

7.49m  X 

10J8m 

L-Band 

complex 

quad^le 

12.22  km 

SIR-B  SAR 

Raisin  City,  CA 

34.6m  X 

28.5m 

L-Band  23  cm 

H-H 

235  km 

SEASAT 

SAR 

Yuma,  AZ 

25m  X 

25m 

L-Band  24  cm 

H-H 

800  km 

Landsa(-4  TM  Raisin  City,  CA 
Yuma,  AZ 

*  band  7  Landsat  resolution  120m 

30m  X 

30m* 

band  1 .45pjn  -  ,524m 
band  2 .524m  -  .604m 
band  3  .634m  -  .694m 
band  4  .764m  -  .90jun 
band  5  1.554m  -  1.754m 
band  6  10.44m  -  12.54m 
band  7  2.084m  -  2354m 

N/A 

705  km 

Table  2-1:  Resolutions  for  Sensor  Types 
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The  resolution  of  the  various  data  sets  is  given  in  Tad^le  2-1.  Examples 
of  the  releveunt  imagery  appear  in  Fig.  2-1  to  2-5. 

The  JPL  quadpole  data  set  involves  independent  measurements  for  the 
horizontal  and  vertical  polarized  SAR  returns.  Prom  these  data,  one  can 
calculate  the  radar  returns  arising  from  any  hypothetically  transmitted 
linearly  polarized  signals.  Recall  that  the  electric  (E)  field  of  an 
electromagnetic  (EM)  wave  instantaneously  resides  in  a  plane  of  vibration. 
If  this  pleuie  of  vibration  is  constauit  we  say  that  the  EM  wave  is  linearly 
polarized. 

The  most  general  case  is  that  this  vibration  plane  rotates,  with  the 
magnitude  of  the  E-field  vector  tracing  an  ellipse  when  projected  into  a 
plane  perpendicular  to  the  propagation  vector.  In  this  case  we  say  the  EM 
wave  is  elliptically  polarized.  A  special  case  of  elliptic  polarization 
is  circular  polarization. 

Elliptically  polarized  waves  can  be  considered  as  the  superposition  of  two 
orthogonal  linearly  polarized  waves  with  a  phase  difference.  Therefore, 
mauny,  though  not  all,  analyses  of  polarization  involve  only  the  linear 
case. 

A  formalism  for  representing  the  outcome  of  scattering  coherent,  polarized 
waves  is  by  the  use  of  the  scattering,  or  Jones,  matrix  [s].  When  post- 
multiplied  by  the  incident  electric  field,  or  Jones  vector,  the  resultant 
scattered  electric  field  is  determined,  ie: 


(t)- 

'Si  1  S.  J  ■ 

■E<‘>(t)- 

LE*’Mt). 

■Sjl  s„. 

.E<*>  (t). 

Another  formalism  for  representing  incoherent  scattering  involves  the  four 
component  Stokes  vector  (S)  and  the  (4x4)  Mueller  (also  called  phase  or 
Stokes)  matrix  (F).  Operationally,  incoherent  scattering  is  described  in 
an  ensemble  sense  using  matrix  multiplication: 


■S*o' 

■Fii 

F 

'12 

^13 

Fi4' 

■sr 

s! 

^2  2 

^2  3 

F24 

n 

S! 

F31 

^3  2 

^3  3 

F34 

S§ 

-SI- 

■^4  1 

^4  2 

F43 

F'44- 

.S§. 

Backscatter  from  terrain  involves  depolarization  and  incoherent  retxirns. 
Instantaneously,  loss  of  polarization  does  not  exist,  ie.  it  is  inherpritiy 
a  time-averaged  concept.  Because  the  scattering  matrix  formalism  always 
involves  full  polarization  and  conplete  coherence,  it  is  an  instantaneous 
concept  and  cannot  be  used  in  an  ensemble  sense.  For  this  reason  and 
because  of  noise  reduction,  most  polarimetry  data  is  in  the  form  of  the 
Mueller  matrix. 
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As  described  in  [Zebker  et  al,87],  the  JPL  imaging  radar  polariraeter 
measures  both  intensity  and  relative  phase  of  radar  backscatter  as  a 
function  of  transmitted  and  received  polarizations.  This  was  accomplished 
by  adding  a  dual-polarized  antenna  and  a  four  channel  data  system  to  the 
JPL  aircraft  SAR. 

The  amplitudes  and  phases  of  all  elements  of  the  scattering  matrix  are 
therefore  are  measured  for  each  pixel  in  the  SAR  image.  This  allows  the 
synthesis  of  any  combination  of  transmitted  and  received  2mtenna 
polarizations.  However,  as  pointed  out  in  iLukert  euid  Blanchard, 88 ] ,  such 
accurate  depolarization  measurements  require  adequate  antenna  polarization 
isolation.  Otherwise  like-polarization  feed-through  can  exist  in  the 
depolarized  channel,  and  this  effect  cam  couple  with  PRF-related  azimuth 
eunbiguities  to  affect  pixels  other  than  the  one  being  processed. 

Another  consideration  is  the  use  of  the  Mueller  matrix  rather  than  the 
scattering  matrix.  Because  of  statistical  variations  caused  by  noise, 
spatial  averaging  of  radar  power  measurements  is  used  at  a  cost  of  reduced 
resolution.  In  such  a  case,  the  assumption  of  uncorrelated  phases  of  EM 
waves  from  individual  scatters  allows  the  (incoherent)  Stokes  vector  of 
the  sum  of  these  waves  to  be  the  sum  of  their  individual  Stokes  vectors. 
This  allows  the  summation  of  their  individual  Mueller  matrices  to  form  a 
composite  Mueller  matrix  for  the  composite  scatter  composed  of  individual 
scatters. 

For  a  fuller  discussion  of  polarization  concepts  and  definitions,  see 
Appendix  9.2. 
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Fig,  2-3:  SIR-B  SAR,  Raisn  City  CA  Site 


2.2  Optical  Images 

Several  image  pre-processing  algorithms  suitable  for  optical  imagery  were 
investigated.  These  included  procedures  for  enheincing  edges,  reducing 
noise,  and  increasing  the  visual  similarity  of  the  optical  and  SAR  images 
to  be  matched. 

The  strategy  for  matching  SAR  to  optical  imagery  was  to  preferentially  use 
the  lower  noise  optical  images  to  cue  features  for  matching  in  the  SAR 
images.  Certain  features,  particularly  those  with  long  linear  signatures, 
can  often  be  segmented  from  optical  imagery  more  easily  and  directly  thaui 
from  SAR  data.  Several  pre-processing  schemes  were  tested  with  this  idea 
in  mind. 

For  edge  enh^ulcement ,  a  method  which  uses  the  sigma  filter  [Lee, 83]  was 
found  to  yield  well-defined  edge  structures.  The  sigma  filter  is  based  on 
the  use  of  a  Gaussian  distribution  model  for  the  pixel  intensity  values  in 
a  local  neighborhood.  The  assumption  is  that  only  those  pixels  whose 
intensity  values  lie  beyond  two  steindard  deviations  (2®)  from  the 
neighborhood  sanple  mean  conprise  another  population.  This  population 
beyond  2-a  represents  either  an  edge  structure  or  shot  noise,  vhereas  the 
population  within  2-a  represents  ui  unchanging  intensity  field,  generally 
corrupted  by  speckle  noise. 

Therefore  only  pixels  within  the  2-a  region  are  averaged  to  reduce  the 
effects  of  sp«ckle  noise,  resulting  in  an  intensity-smoothed  pixel.  An 
ad-hoc  thresholding  procedure  is  used  to  infer  the  presence  of  shot  noise, 
follovred  by  smoothing  of  those  pixels  involving  only  their  4  nearest 
neighbors. 

The  present  edge  detection  procedure  uses  the  sigma  filtered  image  and  the 
original  image  together.  The  pixel-by-pixel  differences  of  the  two  images 
produces  a  well-defined  edge  map  with  stronger  edges  having  brighter 
intensities  thein  the  edges  in  the  original  image. 

This  enhancement  occurs  because  of  the  smoothing  of  flat  regions  along 
with  a  simultcineous  identity  operation  applied  to  the  edges.  The  image 
difference  operation  thus  results  in  amplification  immediately  next  to 
those  areas  having  eibrupt  changes  and  are  therefore  cheuiged  by  the  sigma 
filter,  vdiereas  flatter  regions  are  suppressed  because  of  their  similarity 
in  both  images. 

Note  that  the  original  edges  themselves  are  not  modified  by  the  sigma 
filter,  and  thus  are  also  suppressed  in  the  difference  image.  Therefore, 
the  resulting  edges  in  the  difference  image  are  edges  which  are  sliohtly 
displaced  on  the  order  of  one  pixel. 

In  addtion,  it  was  discovered  that  by  allowing  the  byte  values  of  the 
image  to  wrap  in  the  difference  of  the  original  image,  it  was  possible  to 
obtain  an  essentially  binary  image.  This  occurred  because  the  edges  in 
the  sigma  filtered  image  were  slightly  brighter  than  in  the  original 
image.  As  a  result  of  the  differencing  process,  these  locations  were 


13 


assigned  a  value  slightly  less  than  zero  vdiile  all  other  locations  were 
assigned  a  grey  value  slightly  greater  than  zero.  The  result  was  an  image 
whose  histogram  was  dominated  by  the  grey  values  near  zero  and  near  255, 
with  no  no  pixels  having  grey  values  in  between. 

A  sigma  filter  edge-enhanced  Landsat  TM  betnd  4  image  of  the  Yuma  site  and 
one  of  the  Raisin  City  site  are  shown  in  Fig.  2-6.  The  effect  of  the 
wrapping  con^red  to  the  clipped  difference  process  can  be  seen  in  the 
images.  The  grey  values  of  the  clipped  images  have  been  scaled  to  enhance 
the  dynamic  range  of  the  images. 
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Figure  2-6  Sigma  filter  edge  enhanced  images  of  Yuma  (a) 
bit  -wrapped  (b)  clipped  at  zero  in  the  difference,  (c-d) 
similar  images  of  Raisin  City. 
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Sobel  edge  filter  enhanced  and  sigma  filter  smoothed  Landsat 
Band  4  image  of  Raisin  City  site. 


The  spirit  of  this  method  is  compatible  with  the  concern  voiced  in  [Adair 
and  Guindon,90]  that  the  usual  independent  applications  of  noise 
suppression  and  edge  enhancement  should  not  be  de-coupled.  The  essential 
point  is  that  the  homogeneity  criteria  for  the  adaptive  smoothing  operator 
should  be  consistent  with  the  edoruptness  detection  process  in  the  edge 
operator. 

Another  method  for  edge  enhancement,  the  Sobel  operator  [Ballard,82] ,  was 
used  effectively  as  a  gradient-type  edge  detector  in  bands  1-5  of  the 
Landsat-4  TM  data  sets.  The  result  of  the  sobel  operator  applied  to  a 
sigma  smoothed  Landsat-4  image  of  the  Raisin  City  site  is  shown  in  Fig.  2- 

7. 

Continued  effort  in  Phase  II  will  be  concerned  with  segmentation  of  edges 
euid  contours  using  such  ideas. 

2.3  SAR  Images 

Past  work  in  the  field  of  SAR-optical  registration  has  emphasized  the 
importauice  of  intensity  pre-processing  the  SAR  image  prior  to  matching 
with  an  optical  image.  SAR  images  tend  to  have  lower  signal  to  noise 
ratios  euid  smaller  dynamic  ranges  than  optical  images.  Another 
complicating  factor  is  that  SAR  images  are  involve  multiplicative  noise, 
whereas  passive  optical  imagery  is  corrupted  by  additive  noise. 

However,  as  argued  in  [Frost  et  al,82],  a  more  realistic  noise  model  for 
SAR  image  processing  is  multiplicative-convolved  noise.  This  is  because 
the  product  of  the  terrain  backscatter  and  coherent  fading,  ie.  the 
speckle  process,  is  convolved  with  the  radar  system's  point  spread 
function.  Therefore,  a  straightforward  application  of  the  logarithm  of 
the  SAR  image  will  not  necessarily  separate  signal  and  noise  into  additive 
components.  Direct  deconvolution  of  the  signal  from  this  convolved 
speckle  noise  process  is  generally  computationally  inaccurate  because  of 
the  low  signal  to  noise  ratios  common  to  SAR  images.  Because  of  this 
difficulty,  most  SAR  image  processing  is  simply  performed  on  the  basis  of 
a  multiplicative  noise  model  even  though  it  is  only  approximately  correct. 

Some  of  the  methods  for  noise  reduction  in  SAR  images  include  locally 
adaptive  signal  estimation  [Frost  et  al,82],  [Lee, 83].  These  methods 
attempt  to  estimate  the  signal  locally,  thereby  distinguishing  meaningful 
grey- value  transitions  from  noise-induced  changes.  Appendix  A  contains  a 
fuller  account  of  such  locally  based  noise  reduction  techniques. 

Depending  on  the  matching  methodology  to  be  used,  various  nonlinear 
adaptive  spatial  filters  may  be  used.  For  SAR  images  with  a  large  amount 
of  noise  and  speckle  content,  such  as  is  the  case  with  SEASAT  and  .'-IR-B 
imagery  as  well  as  to  a  lesser  extent  with  aircraft  imagery,  an  adaptive 
filter  reduces  the  effects  of  noise  while  still  preserving  the  integrity 
of  edges.  This  retention  of  the  edge  structure  is  vital  to  our  contour- 
based  matching  algorithm. 
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One  such  filter  is  the  sigma  filter,  described  in  section  2.2.  VEXCEL  has 
used  the  sigma  filter  for  filtering  SEASAT  arctic  imagery  on  another 
project  [ McConnell, 87 ] .  This  filter  was  used  to  filter  the  image  prior  to 
successfully  thresholding  ice  floes  using  histogram  techniques. 

The  sigma  filter  uses  a  local  statistical  model  for  the  flat  and  edge 
popula*‘ions  in  the  SAR  image.  Increasing  the  values  of  sigma  lead  to 
images  with  increasing  amounts  of  smoothing,  as  shown  in  Fig.  2-8. 
Recommendations  for  sigma  values  based  on  naximizing  the  signal  to  noise 
ratio  appear  in  (Lee, 83]. 

Another  edge  preserving,  adaptive  noise  smoothing  algorithm  [Nagao  et 
al,79]  uses  a  set  of  nine  masks  for  each  pixel.  Each  such  mask  covers  a 
select  group  of  the  neighbors  of  a  given  pixel.  The  variance  of  the  pixel 
intensities  of  each  mask  is  calculated,  amd  the  mean  of  the  pixel  mask 
with  the  minimum  variance  is  substituted  for  the  original  pixel  value. 

An  extension  of  the  sigma  i’lter  to  quadpole  SAR  imagery  is  given  in 
(Lee, 90].  The  extension  of  such  adaptive  smoothing  techniques  to  data 
sets  of  polarized  SAR  imagery  represents  a  promising  approach  for 
increasing  the  signal  to  noise  ratio  of  the  filtered  images. 

An  algorithm  based  on  relaxation  (Rosenfeld,82]  was  attempted  for 
reinforcing  "smooth"  contours  formed  by  local  edges.  The  algorithm  is  not 
rigorous  in  a  probabilistic  sense  and  does  not  compute  updated 
probedDilities  in  according  to  Bayes  theorem.  Nevertheless,  it  is 
appealing  in  a  heuristic  sense,  and  seems  to  correspond  to  intuitive 
thinking  about  locally  updating  probabilities  of  an  edge  being  significant 
based  on  its  immediate  surroundings. 

This  algorithm  is  briefly  given  below; 

1)  Estimate  initial  probedailities  p(ifj)  based  on  the  likelihood  that  a 
pixel  is  a  member  of  class  j.  Each  pixel  in  the  window  of  n  pixels 
is  pixel  i.  There  are  m-4  classes  j-l,...4,  corresponding  to  the 
four  edge  orientations  0",  45®,  90®,  135®.  The  sense  of  the  ^ges  is 
ignored  at  this  point  in  this  processing  for  simplicity. 

2)  For  each  point,  determine  the  compatibility  of  each  edge  class 
probeibility  with  that  of  its  eight  neighbors.  Thus,  construct 
c(i,j;h,k),  the  compatibility  of  pixel  i  wiith  class  j,  and  pixel  h 
with  class  k.  The  pixels  each  have  probabilities  p(i,j),  and  p(h,k) 
of  being  a  member  of  the  specified  class.  There  are  eight 
compatibilities  h-l,...,8  for  each  class  of  pixel  i.  Therefore,  for 
each  pixel  there  are  32  compatibilities,  ie.  4  for  each  of  8 
neighbors. 

3)  Update  probcibilities  by  first  computing  q(i,j),  which  is  the  mean  of 
the  sum: 

q(i,j)  -  (l/n-DZ;;,!  {c(i,  j)  p(h,k)} 
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Then  the  new  updated  probability  is  given  by: 

-  P(i.j)oid(l+<3(i» j)) 

This  algorithm  did  not  work  well  in  its  present  form  for  SAR  imagery.  The 
results  are  discussed  in  section  2.5. 

Methods  enployed  for  intensity  re-mapping  include  histogram  equalization 
using  the  cmnulative  distribution  functions  of  both  images 
[Castelenan,79] ,  histogram  re-mapping  using  the  Karhunen-Loeve  Tr2uisform 
[Wong, 77],  eind  correction  using  a  sensor-dependent,  exponential  model  for 
the  optical  image  [Wong, 80). 

2.4  Infrared  Images 

The  NASA  airborne  Thermal  Infrared  Multispectral  Scaunner  (TIMS)  has  six 
wavelength  chainnels  in  the  long  wave  infrared  (8-12  ricrons)  region  of  the 
electromagnetic  spectmmi.  It  has  an  instauitauieous  field  of  view  of  2.5 
mrad,  a  total  field  of  view  of  80  degrees  and  a  nominal  ground  resolution 
of  10  meters.  Table  2-2  shows  the  specifications  of  the  six  LWIR  TIMS 
bands. 

We  havt  acquired  two  daytime  near-nadir  TIMS  images  of  Death  Valley  in  the 
Mojave  Desert  of  California  from  the  Jet  Propulsion  Laboratory  —  one 
taken  in  June  of  1983  and  the  other  in  July  of  1988.  Sortie 
specifications  are  shown  in  Table  2-3.  The  respective  Images  for  band  1 
are  shown  in  Figures  2-9(a)  and  2-9 (b). 


Band 


Wavelength  Limits 
(roicronsT 


Bandwidth  FViHM 
( microns y 


1 

2 

3 

4 

5 

6 


8.2 

-  8.6 

0.4 

8.6 

-  9.0 

0.4 

9.0 

-  9.4 

0.4 

9.4 

-  10.2 

0.8 

10.2 

-  11.2 

1.0 

11.2 

-  12.2 

1.0 

Table  2-2:  TIMS  Wavebands 


Sortie  Date 
Start  Time 
End  Time 
Start  Latitude 
End  Latitude 
Start  Longitude 
End  Longitude 


Death  Valley  1 

6/02/88 
20:59:53 
20:07:43 
36D  44M  48S  N 
36D  OM  OS  N 
117D  IM  24S  W 
116D  51M  54S  W 


Death  Valley  2 

7/22/83 
18:43:38 
18:46:30 
36D  23M  48S  N 
36D  24M  12S  N 
116D  45M  12S  W 
117D  IM  24S  W 


Table  2-3:  Specifications  for  Death  Valley  Flights 
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Death  Valley  is  a  long  narrow  valley  floor  bounded  by  the  Paneunint  Range 
on  the  west  aund  the  Funeral  and  Black  mountains  on  the  east,  which  consist 
of  deformed  metamorphic,  volcanic,  and  sedimentary  rocks.  The  floor  of 
Death  Valley,  which  includes  the  lowest  elevation  in  the  United  States,  is 
a  closed  salt  pan  covered  by  evaporite  deposits  —  the  dark,  delineated, 
stream-like  forms  near  the  right  side  of  Figures  2-9(a)  and  2-10(b). 
These  intermittent  streams  on  the  valley  floor  are  bordered  by  floodplain 
deposits  of  clay  and  slit.  The  valley  floor  is  bordered  by  alluvial  fan 
deposits  of  gravel  eroded  from  the  adjacent  mountains  —  these  are  the 
round  skirt-like  forms  to  either  side  of  the  valley  floor.  On  the 
Panamint  mountains,  to  the  west  (left)  of  the  featureless  alluvial  gravel 
fans,  bedrock  is  exposed  as  intrusive  bright  forms  (volcauiic  rocks)  and 
rough,  jagged  dark  forms  (carbonate  rocks).  Also  of  interest  are  two  man¬ 
made  features  near  the  upper  right  (north-east)  of  Figure  2-9(b):  Furnace 
Creek  R^u^ch  (a  mottled  square  dark  feature)  euid  Furnace  Creek  airstrip  (  a 
long  thin  feature  oriented  in  the  north  to  south  direction).  The  only 
common  m2ui-made  feature  between  these  two  images  is  the  airstrip. 

It  is  evident  from  Figures  2-9(a)  and  2-9(b)  that  the  Death  Valley  images 
have  different  scales  eind  are  grossly  mis registered.  The  first  step  in 
the  preprocessing  was  therefore  to  perform  a  global  registration  based  on 
a  set  of  54  manually  located  control  points  in  the  two  images.  For  each 
image  dimension,  the  measured  displacements  between  corresponding  control 
points  were  fit  to  a  thin-plate  cubic  spline  and  interpolated  to  generate 
a  warping  surface  (one  for  each  image  dimension).  The  local  2-D 
misregistrations  provided  by  these  surfaces  were  then  used  to  resample 
each  of  the  images  using  a  sliding-window  cubic  B-spline  kernel  (discussed 
in  Section  4.3).  The  globally  registered  images  for  TIMS  Band  1,  shown  in 
Figures  2-10(a)  and  2-10(b),  visually  appear  to  be  quite  well  aligned. 
The  same  global  registration  step  was  also  applied  to  the  TIMS  images  in 
Bands  3  and  5,  so  that  the  three  bands  for  each  observation  could  be 
processed  together. 

The  apparent  similarity  of  multi-band  infrared  data  sets  is  sometimes 
greater  in  certain  principle  spectral  conponents  them  in  the  original 
bemds  themselves,  due  to  enhancement  of  spectrally  distinct  features  in 
the  data.  Figures  2-11  through  2-13  compare  the  three  respective 
principle  components  computed  from  Bands  1,  3  and  5  (after  global 
registration)  for  each  of  tlie  two  Death  Valley  data  sets.  The  long-term 
correspondence  between  certain  features  in  Conponents  2  emd  3  (Fig.  2-12 
and  2-13)  is  particularly  striking.  Note  also  that  the  Furnace  Creek 
airstrip  is  clearly  visible  in  both  second  component  images  as  a  somewhat 
crooked  dark  line  near  the  upper  right-hand  edge  of  the  frame.  Since  the 
airstrip  is  known  to  be  a  linear  feature,  this  shape  indicates  the 
presence  of  residual  global  registration  error  near  the  right  frame  edge. 

2.5  Pre-Processing  Results 

The  SAR  imagery  used  in  the  present  effort  contains  a  large  amount  of 
speckle  noise.  The  SIR-B  imagery  had  the  smallest  signal  to  noise  ratio 
emd  was  more  dominated  Ijy  speckle  than  the  other  SAR  types.  Moreover,  the 
smaller  look  angle  of  the  space  platform  led  to  a  higher  degree  of 
geometric  distortions. 
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(a)  Death  Valley  1  (1983)  (b)  Death  Valley  2  (1988) 
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Fig.  2-11.  First  Principle  Component  Images  from  TIMS  Bands  1,3.5 


-14 ;  Sigma  filtered  SEASAT  SAR,  smoothing  parameter  s*.25,  5x5 
pixel  neighborhood. 
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Sigma  filtered  SIR-B  SAR,  smoothing  parameter  s-.l,  5x5 
pixel  neighborhood. 
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Fiq.  2-16:  Sigma  filtered  JPL  (H-H)  SAR,  smoothing  parameter  s=.l,  5x5 

neighborhood. 
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•  !-« 


filtered,  histogram  remapped  SEASAT  SAR, 
ter  s=  .25,  5x5  neighborhood. 


smoothing 
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The  sigma  filter  was  used  extensively  for  smoothing  the  SEASAT,  SIR-B,  and 
JPL  aircraft  quadpole  SAR  data  sets.  Several  variations  on  the  parameters 
for  filtering  were  tried.  Fig.  2-14  shows  a  sigma-filtered  SEASAT  image 
of  the  YUMA  scene.  Fig, 2-15  and  2-16  are  sigma-filtered  images  of  the 
Raisin  City  for  SIR-B  and  JPL  quadpole  sensed  data. 

In  order  to  increase  the  amenability  of  the  SAR-optical  matching  process, 
a  global  histogram  remapping  was  applied  to  the  SEASAT  data  based  on  the 
histogram  of  the  corresponding  optical  image.  This  global  filtering  step 
was  then  followed  by  applying  the  locally  adaptive  sigma  filter.  The 
resulting  image  is  shown  in  Fig.  2-17. 

Clearly,  these  sigma-filtered  images  show  inprovements  in  noise  content 
and  the  histogram  remapping  operation  improved  the  visual  similarity  of 
the  SAR  image  to  the  optical  (see  Fig.  2-1  auid  2-2). 

Unfortunately,  the  relaxation  procedure  for  selectively  enphasizing  long 
contours  [Rosenfeld,82  ]  did  not  work  well  with  SAR  imagery.  In 
particular,  most  edges  were  eventually  reinforced  and  strengthened, 
creating  a  confusing,  cluttered  image  instead  of  selectively  segementing 
only  major  contours. 

It  is  believed  that  this  problem  is  because  small  e<..ge  structures  that 
initially  get  enphasized  are  allowed  by  the  algorithm  to  persist  as  the 
number  of  iterations  increases.  Because  of  the  cluttered  nature  of  the 
edges  in  a  SAR  image,  most  small  edge  structures  do  get  emphasized  during 
the  initial  iterations. 

This  suggests  that  a  more  successful  algorithm  should  be  successively 
taking  into  account  larger  neighborhoods  that  are  directionally  oriented 
as  the  iterations  progress,  instead  of  always  using  a  neighborhood  of 
fixed  size. 

Another  modification  that  may  be  considered  for  pre-processing  prior  to 
SAR-optical  matching  is  to  only  reinforce  those  directions  that  are 
contained  in  a  particular  streak  S  that  has  been  extracted  in  the  optical 
image  as  part  of  a  feature  (the  preferential  use  of  streaks  extracted 
first  from  the  optical  image  is  discussed  in  section  3. 2. 2.1).  Then 
perform  this  directionally  selective  reinforcement  sucessively  for  all 
candidate  streaks  S  in  the  SAR  image  which  are  in  the  region  of 
registration  ambiguity  for  S.  Moreover,  the  compatibilities  C(i,j;h,k), 
discussed  in  section  2.3,  should  be  computed  directly  from  their  neighbors 
in  S  using  histogreuns,  instead  of  from  the  compatibility  table  presently 
used. 

These  modifications  will  be  examined  in  Phase  II. 
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Approximate  Registration  of  SAR-Optical  Imagery 

Image-image  registrtion  methods  occur  at  three  scales  of  accuracy: 

o  Very  rough  registration  using  geocoding  methods,  resulting  in  errors 
of  multiple  pixels, 

o  Approximate  registration  using  pattern  information  in  the  images, 
resulting  in  errors  on  the  order  of  1  pixel, 

o  Sub-pixel  offset  estimation,  followed  by  interpolation  and  image 
resampling,  resulting  in  residual  fractional  pixel  errors  less  than 
the  true  sub-pixel  offsets. 

Three  main  types  of  approximate  registration  methods  are  discussed  in  this 
section.  They  are  area-based,  contour-based,  aund  combined  methods  using 
both  areas  as  well  as  boundaries  for  matching. 

3.1  General  Issues  and  Strategies 

We  will  classify  feature  matching  techniques  as  region-based  or  contour- 
based,  euid  an  overview  of  methods  is  presented  in  sections  3.1.2  and  3,1.3 
respectively.  Particular  new  methods  developed  for  the  present  effort 
appear  in  sections  32  I  and  3.2.2  respectively.  Section  3.1.1  presents  a 
review  of  the  diff4.oL..v:ies  in  matching  dissimilar  imagery. 

In  either  case,  laatching  requires  some  analytic  metric  for  making  ranked 
comparisons  cL  good,  poor,  and  ambiguous  matches.  The  challenge  is  to 
find  metr-'.cs  which  ere  robust  with  respect  to  a  wide  variety  of 
illumination  and  viewing  conditions. 

One  initial  approximate  registration  method  is  to  simply  use  a  reference 
geoid  amd  geocode  both  images  using  any  navigational  information,  such  as 
GP3  or  INS. 

If  map  data  are  available,  either  as  terrain  (DTED),  feature  (DFAD),  or 
both,  then  one  approach  Cein  employ  the  geocoding  methods  described  in 
[Kober  et  al,88I  as  an  initial  step  for  approximate  dissimilar  image-image 
registration.  The  rule-based  "expert"  assistants  described  in  [Curlander 
et  al,89J  would  then  be  used  to  select  appropriate  control  features  in 
both  the  optical  and  SAR  images. 

More  accurate  SAR-optical  registration  would  then  proceed  on  the  basis  of 
grey-level  information  in  the  two  images.  Of  course,  in  the  absence  of 
any  preliminary  geocoding,  dissimilar  image  registration  would  proceed  on 
the  basis  of  such  pattern  information  alone.  The  following  discussions 
concern  matching  based  soley  on  pattern  information  within  the  images  to 
be  registered. 

3.1.1  Overview  on  Difficulties  in  Matching  Dissimilar  Imagery 
Precision  registration  of  SAR  and  optical  images  is  potentially  difficult 
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because  of  both  terrain-induced  geometric  distortion  euid  radiometric 
differences  between  the  two  sensor  types  (see  Fig.  1).  Compounding  this 
problem  is  the  local  decorrelation  of  gray  values  within  a  single  SAR 
image.  This  smaller  correlation  length  stems  from  the  large  amount  of 
speckle  appearing  in  SAR  imagery.  Such  speckle  effects  are  always  present 
in  coherent  imaging  systems  whenever  there  is  scene  microstructure  on  a 
smaller  scale  than  the  sensor  resolution. 

The  difficulties  for  registration  involve  both  geometric  and  radiometric 
disparities.  One  radiometric  problem  source  is  speckle,  vdiich  has  been 
discussed  in  section  2.3.  Two  remaining  major  factors  contributing  to 
errors  vrtiich  make  precise  registration  difficult  are:  nonlinearities 
associated  with  terrain  cind  low  contrast  regions,  local  contrast 
reversals,  auid  edge  migration. 

As  discussed  in  [Ramapriyeun  et.  al.,86],  grey  value  differences  among 
overlapping  SAR  images  are  sensitive  to  range  variations  both  within  euid 
among  the  images,  as  well  as  incidence  angle  differences. 

Relative  geometric  distortions  between  two  overlapping  images  will  be 
accentuated  by  rapidly  varying  terrain  relief,  resulting  in  compression 
and  stretching  of  corresponding  regions  and  distortions  of  grey  level 
boundaries.  Geometric  compression  actually  involves  the  accumulation  of 
returns  from  multiple  terrain  resolution  elemen..s  into  the  same  rawige 
cell.  Because  of  this  accumulation  of  returns,  the  radiometric  value  of 
the  affected  pixels  can  vary  widely  in  one  image  from  the  corresponding 
pixels  in  the  other  image. 

An  extreme  example  of  discontinuous  distortion  includes  the  phenomenon  of 
layover.  The  layo^'er  effect  results  in  certain  objects  which  are 
planimetrically  further  distant  from  the  sensor  being  imaged  with  a 
smaller  range  than  nearby  objects  plemimetrically  closer  to  the  sensor. 
This  occurs  because  of  very  abrupt  tremsitions  in  height,  as  for  example 
with  flagpoles.  Such  discontinuities  will  almost  always  cause  errors  in 
automated  matching,  unless  the  affected  feature  is  very  small  compared  to 
the  matching  window  or  specialized  logic  for  detection  of  overlays  is 
used. 

Opposite-side  effects  in  SAR  images  may  be  caused  by  changes  in  local 
incidence  angles  eind  environmental  changes,  both  of  \diich  cem  result  in  a 
changed  radar  scattering  cross-section,  and  result  in  nonlinear  grey  value 
mappings  for  corresponding  regions.  What  often  occurs  in  these  cases  is 
called  "contrast  reversals".  These  local  contrast  reversals  are  not 
predictable  from  one  image  to  the  next,  and  can  cause  problems  for  area 
correlation  schemes  if  the  affected  regions  are  a  sizeable  fraction  of  the 
matching  window.  In  fact,  differences  in  coresponding  local  incidence 
angles  can  cause  mismatches  to  occur  even  for  manually  (  ii'icpn 
correspondences  [Leberl,86],  [ Fullerton, 86 ) . 

Low  contrast  regions  with  superimposed  speckle  noise  and  specular  effects 
can  cause  problems  for  area-based  matching  metrics.  In  such  cases,  the 
signal-to-noise  ratios  (S/N)  are  reduced.  In  the  case  of  correlation 
metrics,  such  situations  often  result  in  broad,  relatively  flat 
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correlation  peaks.  Therefore,  the  location  of  the  correlation  maximum  is 
very  sensitive  in  response  to  relatively  small  amounts  of  noise.  Shadows 
czui  also  be  a  problem  for  such  matching  algorithms,  but  also  represent 
imnnrteunt  constraints  on  the  reconstructed  terrain. 

Another  radiometric  pathology  is  edge  migration.  This  situation  occurs  for 
the  direction  of  radar  illumination  just  grazing  the  terrain,  causing  an 
adarupt  tramsition  from  light  to  dark  grey  levels.  As  the  sensor  is 
trauislated  in  the  direction  of  this  terrain  element,  the  position  of  the 
grazing  temgent  moves  away  from  the  sensor. 

Matching  processes  depend  on  such  grey  level  transitions  for 
correspondence,  either  explicitly  or  implicitly.  Edge-based  matching 
algorithms  will  match  such  edges  incorrectly.  Area-based  correlation 
methods  will  be  affected  by  this  effect  in  proportion  to  their  sensitivity 
to  the  higher  spatial  frequencies  corresponding  to  the  affected  grey  level 
edge  amd  its  size  relative  to  the  size  of  the  matching  window.  It  is 
interesting  to  note  that  the  direction  of  the  edge  migration  effect  is 
opposite  that  of  the  parallax  direction  due  to  relief  displacement. 

Even  after  adjusting  for  differing  spatial  resolutions,  the  matching  of 
dissimilar  imagery  is  difficult.  This  is  because  what  is  desired  is 
finding  the  correspondences  of  similar  object  features  whose  image 
signatures  are  not  always  similar.  Some  of  the  geometric  and  radiometric 
factors  which  contribute  to  this  difficulty  are  discussed  below. 

SAR  and  optical  imagery  are  formed  by  sensing  at  different  wavelengths. 
Therefore  the  interactions  of  the  different  wavelengths  and  scene 
roughness  result  in  different  patterns  of  grey  level  textures.  This 
pathology  is  a  resuH  of  the  wavelength  dependence  of  the  scatttering 
cross  section  coefficient  oTg . 

However,  there  are  other  dependencies  of  ffg  that  are  important  at 
microwave  frequencies  but  not  at  optical.  The  discontinuity  of  the 
dielectric  constant  is  very  inportant  for  the  strength  of  radar  return 
from  an  object,  but  not  at  optical  wavelengths.  This  is  why  mositure  from 
a  recent  rainfall  will  change  the  SAR  signature  of  a  cornfield  more  them 
in  the  optical  case. 

Moreover,  the  SAR  is  an  active,  coherently  imaging  instrument,  vdiereas 
optical  sensors,  with  the  exception  of  lasers,  are  passive.  Therefore, 
SAR  is  subject  to  coherence- induced  speckle,  modeled  as  multiplicative 
noise,  whereas  optical  images  generally  are  subject  to  additive  noise. 

One  major  problem  is  the  greater  radiometric  sensitivity  of  a  SAR  compared 
to  a  passive  optical  instrument  when  imaging  the  same  scene  under 
different  viewing  scenarios. 

Low  contrast  regions  with  superimposed  speckle  noise  and  specular  effects 
Ccin  cause  problems  for  area-based  matching  metrics.  In  such  cases,  the 
signal-to-noise  ratios  (S/N)  are  reduced.  In  the  case  of  correlation 
metrics,  such  situations  often  result  in  broad,  relatively  flat 
correlation  peaks.  Therefore,  the  location  of  the  correlation  maximum  is 
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very  sensitive  in  response  to  relatively  small  amounts  of  noise.  Shadows 
can  also  be  a  problem  for  such  matching  algorithms,  but  also  represent 
important  constraints  on  the  reconstructed  terrain. 

Certain  radiometric  pathologies  have  been  observed  in  SAR  images  of  lava 
flows,  forests,  and  sand  dunes  [Blom,88].  These  features  have  been 
observed  to  chaunge  appearamce  dramatically,  even  sometimes  disappearing 
con^jletely,  with  variations  in  look  euigle  and  wavelength.  Subsurface 
imaging  in  some  cases  appears  to  be  involved.  Some  statistical  analyses  of 
scatterraneter  data  seem  to  indicate  that  discrimination  between  lava  flows 
is  better  when  look  angles  are  less  than  about  35  degrees 
[Blom,85,86,87a] .  The  same  type  of  smaller  angle  effect  has  been  observed 
for  semd  dunes  [Blom,81,82a,82b,87b] . 

The  mechanisms  for  such  behavior  are  not  completely  understood  at  present, 
amd  can  therefore  not  be  reliably  predicted.  The  inplication  for  image 
matching  is  that  sometimes  certain  features  amd  textures  visible  in  aui 
optical  image  of  the  same  resolution  may  be  missing  in  a  SAR  image  of  the 
same  scene  even  though  they  are  not  geometrically  occluded. 

Another  problem  is  that  terrain-induced  geometric  distortions  differ  for 
both  types  of  sensors. 

Relative  geometric  distortions  between  two  stereo  images  will  be 
accentuated  by  rapidly  varying  terrain  relief,  resulting  in  compression 
and  stretching  of  •  corresponding  regions  auid  distortions  of  grey  level 
boundaries.  Geometric  compression  actually  involves  the  accumulation  of 
returns  from  multiple  terrain  resolution  elements  into  the  same  range 
cell.  Because  of  this  accumulation  of  returns,  the  radiometric  value  of 
the  affected  pixels  c^n  vary  widely  in  one  image  from  the  corresponding 
pixels  in  the  other  image. 

As  pointed  out  in  [Ramapriyan,86] ,  the  requirement  for  well-conditioned 
stereo  intersections  aggravates  these  problems. 

An  extreme  example  of  discontinuous  distortion  includes  the  phenomenon  of 
overlay.  The  layover  effect  results  in  certain  objects  vdiich  are 
planimetrically  more  distcint  from  the  sensor  being  imaged  with  a  smaller 
reuige  than  nearby  objects  pleinimetrically  closer  to  the  sensor.  This 
occurs  because  of  very  cibrupt  treuisitions  in  height,  as  for  exanple  with 
flagpoles.  Such  discontinuities  will  almost  always  cause  errors  in 
automated  matching,  unless  the  affected  feature  is  very  small  conpared  to 
the  matching  window  or  specialized  logic  for  detection  of  overlays  is 
used. 

3.1.2  Overview  of  Area-Based  Matching 

Template  matching  using  any  of  the  various  forms  of  the  correlation  metric 
has  historically  been  a  useful  method  for  image-image  matching.  This 
method  of  matching  is  generally  region-based  rather  than  boundary-based, 
although  it  can  also  be  used  with  some  degree  of  success  for  matching 
high-pass  filtered  data.  Another  area-based  metric  is  the  sum  of  absolute 
differences  [Barnea,72]. 
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One  classification  scheme  of  image  translation  estimation  methods  appears 
in  [Huang,81],  and  consists  of  :  Fourier-based,  matching,  2md 
differential. 

Differential  methods  apply  to  image  pairs  with  small  amounts  of  relative 
translation,  such  as  successive  video  frames.  Such  methods  are  based  on 
Taylor's  expansion  for  two  variables  tnincated  to  linear  terms.  Obviously, 
such  cui  approach  is  not  applicable  to  this  registration  problem,  and  will 
not  be  discussed  here. 

Matching-type  methods  refer  to  the  use  of  correlation-based  methods  as  a 
measure  of  similarity  for  trial  areas  of  overlap  between  image  pairs.  The 
use  of  such  methods  will  be  discussed  in  sections  3.2,  3.3,  4.  and  5. 

Fourier-based  methods  refer  to  an  explicit  use  of  the  Fourier  Shift 
Theorem  [Champeney,73] ,  which  states  that  if  two  integrable  continuous 
functions  f(x,y)  and  gU,y)  are  related  as:  g(x,y)  -  f(x  +  Ax,  y  +  Ay) 
then  their  Fourier  Trcinsforms  are  related  as: 

F(u,v)  -  G(u,v)  exp{  -j  2n  (u  Ax  +  V  Ay)} 

In  practice,  such  methods  reduce  to  phase  correlation. 

A  classical  method  for  registering  images  is  to  succesively  register 
corresponding  patches  within  the  two  images  using  an  emalytic  metric  for 
coin)arison.  For  each  such  patch  in  one  image,  candidate  trial  patches  in 
the  second  image  are  cort^red  using  this  metric.  The  trial  patch  in  the 
second  image  which  optimizes  the  metric  is  chosen  as  the  "matching"  patch 
for  the  given  patch  in  the  first  image. 

Historically,  the  normalized  correlation  coefficient  has  been  the 
preferred  metric,  although  other  less  robust  versions  of  correlation-type 
metrics  have  been  used  because  of  computational  advantages,  such  as  non- 
normalized  correlation,  or  variations  on  the  sixn  of  absolute  difference 
metrics.  These  three  are  given  below: 

Normalized  Correlation: 
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Here  F(j-m,k-n)  refers 
Correlation: 

N  N 

R(m,n)  =  E  E 

j=l  k=l 


to  an  offset  of  (m,n) 

F  {j,k)  F  (j-m,k-n) 


with  respect  to  F(j,k). 


36 


Slim  of  Absolute  Differences: 


R(m,n,)  «  Z  E  |F  (j,k)  -  F  (j-m,k-n)| 

j-l  k-l 


Normalized  correlation  is  the  preferred  method  whenever  conputational  cost 
is  not  an  issue.  The  value  of  the  normalized  correlation  coefficient  is 
bounded  absolutely,  eund  always  lies  between  -1  2Uid  +1.  Therefore,  absolute 
values  of  this  metric  close  to  +1  achieved  for  trial  offfsets  are  close  to 
local  maxima  in  the  correlation  surface.  The  normalized  metric  can  allow 
for  a  constant  multiplier  between  the  gray-values  for  corresponding  pixels 
in  the  two  images.  Also,  the  normalized  correlation  coefficient 
theoretically  achieves  its  meucimum  value,  in  the  absense  of  noise  auid 
nonlinearities,  at  the  correct  offset  between  the  two  images. 

This  is  not  always  the  case  for  non-normalized  correlation  algorithms, 
lidiich  can  achieve  high  correlation  values  at  some  incorrect  offsets  singly 
because  of  high  gray-values  at  certain  localities.  The  sum  of  absolute 
values  of  differences  generally  performs  better,  in  the  sense  of 
acceptable  accuracy,  than  the  non-normalized  correlation  approach,  but 
also  not  as  well  as  normalized  correlation  (Svedlow,77] . 

Correlation  is  conpatationally  expensive.  One  route  toward  reducing  the 
amount  of  computation  has  been  to  perform  the  equivalent  operation  in  the 
Fourier  domain,  using  the  Fourier  Convolution  Theorem  (Champeney,73] .  The 
basis  of  computational  savings  using  the  FFT  is  that  the  latter  requires  a 
computation  asymptotically  proportional  to  N  log  N  rather  than  N  ,  vrtiere  N 
is  the  dimension  of  a  square  image.  For  relatively  large  images,  there  can 
be  considerable  savings  using  the  FFT  approach.  Up  to  an  order  of 
magnitude  or  more  efficiency  can  sometimes  be  achieved,  depending  on  the 
relative  sizes  of  the  convolution  block  L  and  image  size  N  [Pratt, 78]. 

However,  this  approach  does  not  deal  with  the  observation  that  the 
determination  whether  a  certain  offset  is  very  incorrect  should  sanehow 
require  less  computation  than  the  determination  of  a  correct  offset.  This 
view  is  at  the  root  of  the  idea  of  recursively  searching  for  the  correct 
offset  in  a  "pyramid"  of  increasing-resolution  versions  of  the  image  pairs 
[Rosenfeld,84] . 

This  point-of-view  is  also  the  basis  of  the  sequential  aj^roach  using  the 
sum  of  eibsolute  differences  metric  in  [Barnea,72].  Here,  for  emy  trial 
offset,  whenever  a  pre-determined  threshold  has  been  exceeded  before  the 
entire  sum  has  been  evaluated,  the  summation  is  suspended  and  a  new  trial 
offset  is  evaluated. 

Hybrid  algorithms  usihg  this  approach  for  a  rough  estimate  of  offset, 
followed  by  a  version  of  normalized  correlation  using  statistical  pre¬ 
processing  have  been  suggested  (Pratt, 73).  Clearly,  this  idea  can  be 
generalized  to  the  use  of  other  robust  metrics  following  the  initial  rough 
estimate. 
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Another  problem  with  correlation  measures  in  general,  including  normalized 
correlation,  includes  the  broad,  flat  nature  of  the  peaUt  regions  in  the 
correlation  surface.  This  characteristic  has  advemtages  emd  disadvantages. 
An  advantage  of  broad  peaks  is  their  "pull-in"  range  for  search  techniques 
that  don't  sample  everywhere  or  that  employ  averaging,  as  in  multi- 
resolution  processing  [Rosenfeld,84] . 

A  disadvantage  is  that  registration  accuracy  can  be  coiifjromised  whenever 
peaks  are  not  sharp,  cind  smaller  perturbations  due  to  noise  ceui 
potentially  have  greater  effects  on  accuracy. 

This  broad  peak  characteristic  of  correlation  techniques  occurs  because 
spatial  relationships  in  images  are  ignored  for  the  most  part.  Instead, 
the  values  of  the  correlation  metric  are  really  related  to  energies  of  the 
broad  areas  within  the  images,  i.e.  the  energy  content  of  the  low- 
frequency  portions  of  the  images.  Since  phase  shifts  of  these  lower 
frequencies  can  be  relatively  large  con^^red  to  the  pixel  resolution  and 
still  be  relatively  small  compared  to  the  corresponding  low-frequency 
wavelengths,  there  is  a  lack  of  sensitivity  in  correlation  metrics  to 
phase  shifts.  Therefore,  correlation  surfaces  tend  to  have  broad  peaks. 

One  approach  toward  solving  this  problem  has  involved  the  preferential  use 
of  phase  information  in  the  images,  "nie  idea  here  is  that  the  Fourier 
phase  content  of  the  images  contains  more  accurate  information  than  the 
low-frequencies  vrtiich  dominate  correlation  metrics.  This  is  the  basis  for 
phase  correlation  methods  (Kuglin,75,79] ,  (Pearson, 77] ,  [DeCastro,87] . 

Using  the  previous  exanple  of  the  functions  f(x,y)  and  g(x,y)  related  by 
translations  Ax  and  Ay,  the  inverse  Fourier  transform  of  {F(u,v)  / 
G(u,v)}  is  just  the  inverse  transform  of  the  phase  term,  and  thus  is  equal 
to  the  Dirac  Delta  Distribution  evaluated  at  Ax  and  Ay,  i.e.  S(Ax, 
Ay) 

Now,  because  of  the  effects  of  sampling  and  finite  image  sizes,  sidelobes 
occur  in  addition  to  a  main  peak.  Therefore,  in  practice,  this  technique 
reduces  to  correlation  in  the  Fourier  phase  domain. 

Such  methods  are  also  capedale  of  subpixel  accuracy  with  the  use  of 
interpolation,  but  suffer  from  the  problem  of  potentially  high  sidelobes. 
Such  a  correlation  surface,  with  sharp  peaks  and  high  sidelobes,  does  not 
lend  itself  easily  to  hierarchical  processing  with  reduced  resolutions 
because  of  the  narrow  "pull-in"  range  of  the  main  lobe.  However,  used  in 
conjunction  with  other  methods  which  can  acquire  the  main  lobe,  phase 
correlation  can  be  a  useful  method  for  refining  initial  offset  estimates 
to  subpixel  accuracy. 

Another  approach  to  rectify  this  broad  peak  problem  has  been  lo 
concentrate  on  the  edge  content  in  the  images.  This  approach  has  both 
intuitive  appeal  as  well  as  some  theoretical  justification. 

Intuitively,  it  would  seem  that  the  "meaningful"  information  in  an  image 
lies  at  the  locations  of  large,  structurally  significeint  contours  and 
boundaries.  Particularly  in  SAR  images,  the  smaller  edges  are  more  often 
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due  to  noise,  speckle,  and  imaging  effects.  However,  the  larger  edges  are 
due  to  terrain  and  thematic  value  changes.  It  would  seem  that  it  is  these 
edges  and  boundaries  that  should  be  involved  in  a  registration  scheme. 

This  notion  has  been  explored  on  a  more  analytical  basis  from  the 
standpoint  of  "optimally"  filtering  the  images  as  a  pre-processing  step 
prior  to  registration.  One  approach  has  been  to  decorrelate  the  images  by 
applying  a  "whitening"  filter  [Pratt, 73],  (Svedlow,78) .  These  methods 
essentially  differ  in  their  assun^Jtions  concerning  image  structure  and 
statistical  properties.  The  conclusions  of  these,  works  point  toward  the 
use  of  pre-processing  filters  vhich  can  be  approximated,  lander  certain 
assunptions,  by  gradient  filters  [ Svedlow, 78 ] ,  or  Laplacieui  filters 
[Pratt, 73] . 

Of  course,  there  are  some  problems  associated  with  this  approach  also.  The 
high-frequency  edge  content  of  an  image  is  relatively  small  cou^red  to 
the  total  area  of  ein  image.  Edges  can  be  broken  up  slightly  differently  in 
two  images  which  otherwise  contain  no  other  perturbations.  Edges  may  also 
have  slight  variations  in  thickness.  Therefore,  such  perturbations  can 
lead  to  misalignment  sensitivities  for  algorithms,  like  edge  correlation, 
which  examine  the  degree  of  "match"  in  overlapping  edge  images.  One 
approach  is  to  condition  the  edges  in  both  images.  This  would  include 
normalizing  their  intensities  and  broadening  them.  However,  broadening 
edges  caui  lead  to  reduction  of  registration  accuracy. 

These  considerations  mentioned  above  suggest  that  correlation-based 
matching  involves  consideradile  searching  eund  is  conputationally  expensive, 
imless  the  hierarchical  "coarse-to-fine"  approach  previously  discused  is 
used.  The  capabilities  of  correlation  algorithms  for  achieving  high 
accuracies  often  seems  to  require  some  form  of  pre-filtering  to  enhance 
edge  content.  However,  certain  potential  instabilities  are  involved  with 
the  use  of  edge  correlation. 

The  use  of  correlation,  however,  assxiroes  that  there  is  little  nonlinear 
geometric  or  radiometric  distortion  between  the  tenplate  and  the  actual 
imaged  feature  instance  [Lahart,70]. 

3.1.3  Overview  of  Boundary-Based  Matching  Methods 

One  method  of  boundary  matching  in  binarized  images  \^ich  has  achieved 
some  success  is  "chamfer  matching"  [Barrows,  78).  This  method  uses  a 
distance  array  between  features  in  patches  being  matched,  amd  estimates 
the  translations  between  patches  by  searching  for  those  offsets  vrtiich 
minimize  the  distance  array  sums.  This  technique  generally  requires 
cleaner  extraction  of  edges  than  does  edge  correlation.  As  is  the  case 
with  all  edge-based  matching  techniques,  this  metric  is  more  sensitive  to 
perturbations  perpendicular  to  lines  in  the  image  than  parallel  to  such 
lines. 

One  method  developed  at  VEXCEL  (McConnell, 87]  performs  matching  for 
translation  estimation  by  creating  binarized  images  from  Marr-Hildreth 
operator  zero  crossings.  This  method  has  the  advantage  of  accurate 
computation  of  edge  content  using  this  second-order  operator,  as  well  as 
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the  steibility  of  regional  matching.  This  method  is  especially  well-suited 
for  registering  opposite-side  satellite  SAR  images. 

Another  method  was  developed  at  VEXCEL  for  registering  ice  floes  in  arctic 
SAR  imagery.  This  method  performed  boundary  matching  using  dynamic 
programming,  and  reduced  a  two-dimensional  search  problem  to  one  involving 
one-dimension.  This  method  employed  the  psi-s  representation  of  closed 
contours  [Ballard, 82] ,  vhich  allowed  a  convenient,  representation  for 
trzmslation  emd  rotation.  The  particular  implementation  of  the  dynamic 
programming  algorithm  [Sankoff,83]  was  that  used  for  string  search  and 
other  sequence  comparisons.  The  assimption  was  that  the  ice  floes  are 
rigidly  rotated  and  translated,  but  some  very  localized  distortions  such 
as  expauision,  contraction,  insertion  and  deletion  could  be  tolerated. 

Graph  matching  is  a  method  vhich  incorporates  topological  relationships 
into  the  matching  process  without  undue  emphasis  on  "exact"  metric 
correspondence.  This  method  has  found  considerable  use  in  image-image 
matching,  for  example  [Price,  82],  [ Ballard, 82 ] ,  [Nevatia,  82], 
[Shapiro, 81] ,  [Ayache,87].  Such  problems  involve  subgraph  isomorphism  and 
can  potentially  be  complex.  Although  graph  matching  problems  belong  to  the 
worst-case  computationally  intractable  class  NP-complete  [Aho,74],  the  use 
of  heuristics  to  reduce  the  search  has  been  successful. 

In  fact,  all  computationally  successful  graph  matching  algorithms  imust 
somehow  use  some  tasans  of  distinguishing  salient  features  in  order  to 
avoid  the  problem  of  combinatorial  explosion  when  searching. 

For  example,  in  [Medioni,84] ,  graph  matching  was  used  for  both  image- image 
and  image-map  matching.  The  computational  complexity  of  matching  was 
reduced  by  the  use  of  a  "coarse  to  fine"  matching  strategy  vdiich  extracted 
isolated,  long  edges  first,  and  matched  these  to  the  model  at  low 
resolution  using  relaxation. 

Other  boundary-oriented  graph  matching  techniques  have  emerged  from  the 
field  of  robot  vision.  Typical  of  the  problems  encountered  in  this  field 
is  to  correctly  identify  the  individual  parts  in  an  imaged  pile  of  parts 
which  may  be  overlapping  and  are  partially  obscured  by  each  other.  Since 
the  surfaces  of  these  parts  are  often  smooth,  the  boundaries  contain  most 
of  the  identifying  information. 

An  example  of  such  work  is  [Bolles,82].  Significant  local  features  are 
identified,  such  as  corners  and  holes.  These  features  are  then  clustered, 
and  matching  proceeds  "cluster  to  cluster".  The  algorithm  attempts  to  find 
the  most  significant  cluster.  A  graph  matching  formulation  is  created 
wherein  the  nodes  are  matches  of  image  and  model  features,  and  edges  are 
pair-wise  assignments  between  nodes.  The  match  problem  is  then  equivalent 
to  searching  for  maximal  cliques,  which  has  complexity  NP-complete.  This 
method  assumes  that  features  will  be  clustered  closely  together. 

Another  method  is  that  found  in  [Ayache,84].  Here,  polygonal 
approximations  of  extracted  boundaries  and  model  sides  are  matched  using  a 
strategy  of  matching  the  longest  sides  first,  as  is  done  in  [Medioni,84] . 
Lengths  of  sides  and  corner  angles  are  used  in  measures  of 
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"compatibility".  This  method  also  attempts  to  generate  "hypotheses"  and 
continues  until  a  sufficient  nvimber  of  hypotheses  are  evaluated  and  a  good 
match  score  has  been  obtained. 

The  method  of  [Turney, 85]  divides  a  tenplate  of  length  n  into  n/2 
subteii5)lates  of  length  h.  Every  other  pixel  on  the  boundary  of  the 
tenplate  starts  a  subtemplate.  Each  siabtemplate  of  each  model  object  is 
then  compared  to  the  extracted  object  boundary  in  the  sensed  image. 

The  boundaries  are  represented  using  a  variation  of  the  psi-s 
representation  mentioned  previously.  The  matching  metric  is  the  least- 
squares  measure  of  the  differences.  The  method  attempts  to  reduce  the 
combinatorics  of  the  comparisons  by  selecting  "most  salient"  subtemplates. 
The  sucessful  matching  of  such  salient  siibten^lates  is  weighted  more  than 
other  matches.  Again,  this  algorithm  is  slow. 

The  spirit  of  this  last  method,  however,  is  carried  out  with  significant 
computational  in^jrovements  in  [ Schwarz, 85 ] .  Moreover,  [Kalvin,86] 
continues  con^tational  inprovements  on  this  approach  by  employing  the 
clever  idea  of  "geometric  hashing".  This  hashing  is  accomplished  by  the 
use  of  a  "footprint"  of  2un  object's  bomdary.  Only  a  small  number  of 
object  footprints  are  retrieved  for  potential  matching  checks.  This  number 
of  retrieved  footprints  does  not  depend  strongly  on  the  number  of  objects 
in  the  model  dateisase. 

A  footprint  is  a  crude  geometric  characterization  of  an  object  boundary 
using  5  dimensions.  A  footprint  is  generated  by  a  mapping  that  is  not 
necessarily  1-1,  but  satisfies  invariance  xmder  translation  and  rotation, 
continuity  is  essentially  preserved  by  having  locally  similar  objects  into 
similar  footprints.  Tta  five  dimensional  representation  of  each  point  of 
the  object's  boundary  involves  the  first  four  Fourier  coeficients  of  the 
boundary  and  the  turning  angle  of  a  polygonal  approximation  at  that 
boundary  point. 

The  hashing  occurs  because  5-D  space  is  divided  into  hypercubes  of  fixed 
size,  and  for  each  hypercube  there  is  a  list  of  all  models  whose 
footprints  lie  in  that  hypercube.  In  this  way,  searching  for  potential 
matches  of  boundary  segments  is  considerably  reduced. 

Another  approach  is  the  method  of  invariant  central  mcMnents.  The  use  of 
this  method  as  a  theoretical  pattern  recognition  invariauat  under 
translation,  scaling,  and  rotation  first  appears  in  [Hu, 62],  auid  was  used 
in  [Wong, 80]  for  SAR-optical  matching.  The  latter  approach  essentially 
used  orthogonal  expansions  via  the  first  seven  invariant  moments  of  a 
radar  sub-image  and  a  trial  offset  window  in  the  optical  window  as  the 
components  of  feature  vectors  to  be  correlated; 

R(x,y)  »  {i:LiM.Ni(x,y)}/{Z’^^MfEj^iN2(x,y)} 

vhete:  M- =  i*^**  invariant  moment  of  the  target  SAR  sub-image 

N^(x,y)  =  i*^^  invariant  moment  of  the  optical  window  at  trial 
offset  (x,y) 
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Now,  ordinary  moments  are  defined  as: 


“Vq”  ;!.f(x,y)xPy5dxdy 

where:  f(x,y)  -  image  grey  value  at  (x,y) 

These  ordinary  moments  were  used  in  {Hu, 62)  to  define  central  moments, 
which  were  then  used  to  define  the  seven  invariant  moment  fxanctions  used 
above,  :  i-l,...,7.  A  disadvantage  of  using  the  higher  invariant 
moments  is  their  instcibility  with  respect  to  noise  euid  discretization 
effects,  as  well  as  integration  errors.  Moreover,  moments  are  not 
contrast  invariant  [Maitra,79]. 

Hierarchical  matching  using  a  resolution  pyramid  can  decrease  the  amount 
of  conpjtation  considerably  if  the  in^rtant  features  do  not  disappear 
when  the  resolution  is  decreased. 

3.2  Methods  Used  in  Present  Effort 

Three  types  of  approximate  registration  methods  are  discussed  in  this 
section:  area-based,  contour-based,  eind  combined.  These  are  discussed  in 
the  following  3  sub-sectionr . 

3.2.1  Area-Based 

Two  algorithms  were  formulated  and  tested  for  estimating  pixel  offsets  of 
an  optical  chip  within  a  SAR  image.  Both  of  these  methods  assumed  only 
translational  differences  between  a  small  optical  chip  and  the 
corresponding  region  in  the  SAR  image.  No  appreciable  terrain  distortions 
were  present. 

One  method  involved  the  Karhunen-Loeve  Transform  (K-L),  and  the  other 
concerned  the  use  of  the  Maximum  Noise  Treuisforroation  (MNF).  These  two 
methods  are  detailed  in  the  next  two  sections,  3. 2. 1.1  emd  3. 2. 1.2. 

3. 2. 1.1  Translation  Estimation  Using  the  Karhunen-Loeve  Tremsform 

The  following  intensity  treinsformation  for  SAR  images  for  use  in  matching 
with  an  optical  chip  is  based  on  the  Karhunen-Loeve  (K-L)  Transformation. 
This  orthogonal  transformation  has  been  used  previously  as  a  pre¬ 
processing  step  for  both  noise  removal,  and  for  intensity  re-mapping  a  SAR 
image  on  the  basis  of  intensities  in  an  optical  image  with  which  the  SAR 
image  has  been  registered  previously. 

However,  the  K-L  transformation  was  used  in  the  following  way  for 
registering  a  SAR  image  aind  an  optical  chip: 

A  chip  in  the  SAR  image  was  re-mapped  using  two  distinct  intensity  re¬ 
mapping  functions,  one  of  which  was  registration  independent  while  the 
other  was  registration  dependent.  The  registration  independent  method 
used  involved  histogram  equalization  [Castleman,79] .  The  registration 
dependent  method  used  the  eigenvector  of  the  maximum  eigenvalue  of  the  K-L 


transformation  of  the  covariance  matrix  of  the  pixel  co-occurrence  values 
for  a  given  trial  registration. 

Therefore,  for  each  trial  offset  of  the  optical  chip  within  the  SAR  image, 
the  two  resulting  intensity  re-mapped  versions  of  the  corresponding  SAR 
chif  were  evaluated  using  normalized  correlation.  The  idea  was  that  at 
the  correct  offset,  there  should  be  a  greater  similarity  between  these 
versions  than  at  other  offsets. 

This  hypothesis  proved  to  be  generally  correct  only  in  a  local  sense,  ie. 
the  correct  offset  was  usually  found  to  be  associated  with  a  local  mzucimum 
of  the  resulting  correlation  surface.  However,  it  was  observed  that  at 
those  "correct"  local  maxima  which  were  not  also  global  maxima,  there  was 
a  much  more  dreiinatic  local  chzmge  in  the  mean  intensity  levels  of  the  K-L 
re-mapped  SAR  image.  These  results  are  discussed  in  [Curlander  and 
Kober,89].  Iherefore,  the  combination  of  these  two  criteria  is  a 
promising  method  for  further  investigation. 

This  algorithm  is  briefly  described  below: 

For  a  trial  registration  offset  of  the  optical  chip  in  the  SAR  image, 
compute : 

o  Form  the  (2x2)  con^site  covariance  matrix  K  corresponding  to  a  sum 
of  the  covariamce  matrices  for  each  pixel  pairing  in  the  window. 
Here  K  is  given  by: 

K-  (1/N) 

where  m  -  (l/N)I‘:’\.j^fj^ 

fi-  (f<i' ,f<i' i-l,...,N 
i  -  i^**  registered  pixel,  i  ■■1,...,N 
f*^’-  grey  value  of  i^^  registered  pixel  in  image  1 
fU)«  grey  value  of  i^**  registered  pixel  in  image  2 
o  Collate  the  maximum  eigenvalue  X  of  K  cuid  its  eigenvector  (Xj  iXj  : 
X-  {a+d+[(a-d)2+4b2  ]i/2}/2 


Lx,  J 


b/[b2+(X-a)2 
L(X-a)/[b2+(X-a)2 


where  the  coefficients  a,b,d  are  entries  of  the  matrix  K: 


K 


a  b" 
.b  d. 
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o  For  each  SAR  pixel  in  the  trial  offset  window,  conpute  a  new  grey 
value  : 

9i- 

o  Compute  the  nonnalired  cross  correlation  of  the  intensity  remapped 
SAR  window  with  the  same  patch  in  the  SAR  image  which  has  been 
intensity  remapped  using  histogram  equalization. 

3. 2. 1.2  Translation  Estimation  Using  the  Maximum  Noise  Fraction 

Transformation 

Another  intensity  treinsformation  in  the  spirit  of  orthogonal  deconposition 
is  the  Maximum  Noise  Fraction  Transformation  (MNF)  [Green  et  al,88].  This 
trzuis format ion  is  a  generalization  of  the  K-L  {or  prinicipal  conponents) 
treuisformation  for  multi-spectral  imagery. 

The  conventional  wisdom  regarding  the  K-L  traunsformation  of  a  multi- 
spectral  data  set  is  that  it  can  be  used  to  provide  a  sequence  of 
"eigenimages"  of  decreasing  S/N,  and  that  the  first  two  or  three 
incorporate  most  of  the  inportant  information  in  the  dataset.  In  fact,  it 
is  shown  in  (Green  et  al,88]  that  this  is  true  only  if  the  noise 
components  in  all  of  the  bands  are  uncorrelated  auid  of  equal  variance.  An 
example  is  given  vdiereby  the  K-L  transformation  specifically  does  not 
provide  an  optimal  ordering  for  image  quality  in  the  case  airborne 
thematic  mapper  (ATM)  simulator  data. 

However,  the  MNF  transformation  is  specifically  derived  to  achieve 
increasing  image  quality,  as  measured  by  S/N,  for  decreasing  component 
number.  The  MNF  transformation  can  be  used  to  order  a  new  sequence  of 
images  into  increasing  S/N  quality,  so  that  the  lowest  quality  images  can 
be  subjected  to  the  most  intense  averaging  procedures  of  noise  removal. 
The  MNF  traunsformation  is  then  inverted  to  return  to  the  original  mult- 
band  data  set,  but  vdnich  has  now  been  noise  filtered. 

In  the  special  case  wherein  all  of  the  noise  resides  in  one  band  only, 
there  is  a  special  solution  to  this  procedure.  This  is  approximately  true 
in  the  case  of  an  optical  chip  and  a  SAR  image,  since  the  SAR  image 
suffers  from  a  great  deal  of  speckle. 

Therefore,  in  the  spirit  of  the  previous  algorithm  in  section  3. 3. 3. 1.1, 
this  transformation  was  applied  to  the  SAR  image  for  each  trial  offset  of 
the  optical  chip,  followed  by  normalized  correlation.  The  results, 
discussed  in  [Curlander  and  Kober,89],  were  slightly  better  than  for  the 
K-L  re-mapping  method. 

This  algorithm  is  briefly  described  below: 
o  Set  Zj(x)  *  logjo  R(x) 

where  R(x)  =  grey  value  in  SAR  image  at  pixel  x 
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o  Set  22(x)  -  grey  value  in  optical  image  at  pixel  x 
o  For  each  offset  of  optical  window  in  SAR  image,  form; 


o 

o 


- 

- 

-  Z 


joean  value  of  z^(x)  in  window 
meaui  value  of  Zj  ( x )  in  window 

the  covariance  matrix  of  the  pixel  co-occurences 


Set  z*(x)  -  1%-  (v^jAii  )(22  ) 

Form  new  re-mapped  SAR  window  consisting  of  lO^^^  for  each  pixel,  \diere 


y  -  z*  (x) 

o  As  an  alternative  to  the  exponentiation  step  above,  the  use  of 
histogram  hype rbolizat ion  [Frei,77]  was  used.  This  procedure  is 
briefly  described  below. 

Histogram  hyperbolization  represents  a  modification  to  the  idea  of 
histogram  equalization  for  contrast  enhancement  which  takes  the  human 
visual  system  into  account.  The  idea  is  to  create  a  uniform  distribution 
of  "perceived"  intensity  values,  as  distinct  from  a  uniform  distribution 
of  actual  values  (histogram  equalization).  Empirically,  the  use  of  this 
procedure  on  logarithmically  processed  imagery  has  resulted  in  better 
contrast  enhancement  yian  with  re-exponentiation. 


A  logarithmic  response  to  image  intensities,  corresponding  to  perceived 
intensities,  cem  be  approximately  modeled  as: 


B  -  log(J(I)+c) 

where:  I  -  intensity  value  prior  to  intensity  re-mapping 

J(I)-  intensity  value  after  intensity  re-mapping 


B-  perceived  brightness  value 

Equalization  of  perceived  brightness  values  leads  to  the  expression: 

J(I)  -  c(exp  [log(l+l/c)Fj I  -  1) 

where:  F^  =  cumulative  distribution  function  of  the  image  intensities 

prior  to  re-mapping. 

c  =  arbitrary  constant 


Another  direction  for  SAR-optical  image  matching  involves  concnetrating  on 
the  higher  frequency  portions  of  the  datasets.  Intuitively,  the  edge 
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contents  of  both  images  have  more  in  conimon  than  do  the  lower  frequency, 
broader  regions.  As  an  alternative  to  edge  e.^i^auicement  methods,  the 
following  (3x3)  spatial  de-correlation  filter  (Pratt, 78]  was  used: 

■p^  -p(l+p^)  p^ 

-p(l+p^)  (1+p^)^  -p(l+p^) 

.p2  -p(l+p^)  p^ 

This  filter  was  used  both  by  itself,  and  with  both  the  K-L  and  the  MNF  re¬ 
mapped  versions  of  a  SAR  chip  prior  to  normalized  correlation.  Results 
are  discussed  in  (Curlander  and  Kober,89]. 

3.2.2  Contour-Based 

Any  algorithm  that  segments  data  from  an  EO  image  to  cue  a  search  in  a  SAR 
image  must  take  into  account  the  differing  projections  for  both  images. 
Projection  effects,  due  to  the  2uigle  between  the  sensor  look  direction  and 
the  feature  as  well  as  the  sensor  depression  eungle,  are  important 
determinants  in  the  dissimilarity  between  an  corresponding  SAR  and  EO 
image  pair.  Such  projection  effects  are  less  of  a  problem  for  matching 
long  linear  features  than  shorter  areal  features. 

3. 2. 2.1  The  Profile  Method  For  Contour-Based  Matching 

One  new  algorithm  developed  at  VEXCEL  searches  for  long,  linear  features 
characterized  by  local  contrast  on  both  sides  of  the  feature.  The 
targeted  features  are  only  a  few  pixels  wide  and  are  loclly  brighter  or 
darker  than  a  surrounding  area  of  size  defined  by  the  user.  Essentially, 
the  algorithm  uses  a  suinned  profile  of  the  pixel  values  in  a  local  region 
as  shown  in  Fig.  3-1. 

In  a  small  window,  rows  and/or  columns'  of  pixel  values  are  simaned  to 
produce  a  profile.  Ideally,  any  horizontal  or  vertical  linear  feature 
that  is  only  a  few  pixels  wide  coit^red  to  the  beind  size  of  the  profile 
will  stand  out  by  having  a  sxrni  value  significantly  greater  than,  or  less 
than,  the  raeeua  value  of  the  profile.  This  idea  exploits  the  local 
contrast  of  the  edge. 

This  method  is  also  more  sensitive  than  nany  other  methods  of  edge 
segmentation  for  regions  with  narrow  edges,  ie.  having  a  thickness  of  one 
or  two  pixels,  superimposed  on  a  broad,  uniform  background.  The  method's 
robustness  to  noise  effects  is  improved  because  integration,  in  the  form 
of  adding  pixel  values,  is  a  more  stable  operation  than  the  usual 
differentiation  step  in  edge  segmentation  operations.  Searching 
preferentially  for  longer  horizontal  lines  than  vertical  lines  cnn  lie 
accomplished  by  using  a  local  region  mask  which  is  more  elongated  in  the 
horizontal  direction. 

Repeatedly  applying  this  method  to  multiple,  adjacent  windows  in  an  image 
results  in  more  complex  segementations.  Rows  and  columns  which  were 
flagged  as  outliers  in  their  local  profiles  are  compared  with  adjacent 
rows  and  columns  in  the  neighboring  windows.  If  the  adjacent  windows 
contain  flagged  pixels  in  the  adjacent  rows  and  columns,  those  pixels  are 
kept.  Otherwise,  the  flagged  pixels  in  the  original  window  are  discarded. 
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Fig.  3-1;  (a)  A  feature  of  interest  in  a  noisy  background,  (b)  window 

divided  for  local  summed  profiles  in  vertical  direction,  (c) 
edges  extracted  by  thresholding  profiles  in  each  block  using 
local  statistics,  (d)  edges  after  removal  of  discontinuous 
segments,  (e)  vector  representation  of  edge  contour. 
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feature  mask  at  trial  offset  in 
SAR  with  profile  blocks  to  be 
summed  together 


feature  in 
SAR  im.age 


^^9*  3—2;  Use  of  edge  contour  from  optical  iamge  to  cue  customized 
profiling  in  SAR  image.  (Note  when  the  slab  produced  by  the 
profile  is  laid  around  the  vector  contour  is  exactly  placed 
upon  the  real  contour  in  the  SAR  image,  the  profile  will 
display  a  peak  at  the  center). 
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slab  to  be  sum-profiled  at  various  offsets  in  the  SAR  imge. 
At  each  offset  any  outliers  in  the  profile  are  highlighted  and 
added  to  the  output  image  at  the  corresponding  locations 
indicated  by  the  peak  in  the  profile.  The  actual  locations  of 
the  contour  in  the  SAR  may  be  detected  by  several  offsets  of 
the  slab,  but  will  always  highlight  the  same  locations  in  the 
output  image.  The  vectorized  mask  can  then  be  correlated  with 
the  output  image  to  indicate  the  best  offset  in  the  SAR. 


In  this  algorithm,  it  is  in^rtant  to  ensure  that  vrtienever  a  potential 
edge  is  discarded,  it  cannot  cause  another  edge  in  an  adjacent  edge  to 
also  be  discarded  and  therby  cause  a  cascading  sequence  of  edge  discards. 

A  subsequent  process  Ceui  gather  the  flagged  segements  together  based  on  a 
closeness  threshold  for  the  adjacent  endpoints.  The  output  of  this 
process  vrould  be  a  series  of  linked  lists,  each  consisting  of  a  "streak" 
of  sone  length.  The  choice  of  vrfiich  streaks  to  use  for  matching  cam  be 
based  on  streak  length. 

A  similar  algorithm  is  applied  to  the  SAR  imagery,  with  suitable 
modifications  taking  into  account  the  characteristics  of  SAR.  Once  a 
rough  estimate  of  the  location  of  particular  feature  in  the  SAR  is  known, 
then  a  variety  of  methods  may  be  used  to  refine  this  initial  estimate. 

One  variation  of  the  aJxjve  profile  method  Is  now  described  (see  Fig.  3-2). 
This  algorithm  uses  the  same  summed  profile,  but  only  for  streaks 
generated  from  the  EO  imagery.  The  corresponding  streak  in  the  SAR  C2m 
lie  anywhere  within  a  region  of  uncertainty.  An  uncertainty  in  rotation 
is  also  present.  An  assun^tion  of  +3®  is  reasonably  consistent  with 
present  INS  drift  rates  and  mission  durations  of  a  few  hours. 

At  a  candidate  offset  location  within  the  SAR  image,  cued  by  some  streak, 
a  "total  profile"  is  formed  by  summing  successive  profiles  centered  at 
pixels  along  the  streak  with  respect  to  one  of  its  endpoints.  Next,  the 
total  profile  is  thresholded  at  N  standard  deviations  a£ove  and  below  the 
mean. 

Any  pixels  in  the  total  profile  vrtiich  are  identified  as  outliers  indicate 
streaks  parallel  to  the  currently  sought  streak  in  the  EO  image  at  an 
offset  corresponding  to  the  location  within  the  profile. 

This  procedure  is  repeated  for  trial  offsets  in  the  SAR  image  within  the 
region  determined  by  the  uncertainities  in  range  and  azimuth  (see  Fig.  3- 
3).  The  number  of  standard  deviations  associated  with  each  parallel 
streak  is  summed  into  the  output  image  to  produce  a  surface.  Foilwing 
these  conpitations  for  all  trial  offsets,  the  target  streak  from  the 
optical  image  is  correlated  with  this  derived  surface.  Because  of  the 
binary  nature  of  these  streaks,  a  sum  of  edjsolute  differences  metric  can 
be  used  as  the  correlation  metric. 

3. 2. 2. 2  "Beam  Thinning"-  A  Future  Contour-Based  Matching  Algorithm 

Another  matching  algorithm  was  also  developed  theoretically,  and  is 
intended  to  be  investigated  during  Phase  II  for  matching  intensity 
contours  which  have  been  distorted  by  terrain.  Of  course,  such  ter'ain- 
induced  distortions  will  be  different  for  SAR  and  optical  sensors. 
Therefore,  einy  successful  matching  technique  must  be  robust  with  respect 
to  metric  distortions  that  represent  differences  between  sensors,  while 
still  retaining  sensitivity  to  similarities  arising  from  the  common 
terrain  imaged. 
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The  algorithm  described  briefly  below  represents  eui  algorithm  which 
essentially  can  represent  the  local  properties  of  a  curve  vdiich  are  more 
"topological",  as  well  as  metric  descriptions.  In  particular,  it  can 
adequately  represent  the  regions  of  changing  curvature  at  various  scales 
of  resolution. 

It  is  believed  that  curve  descriptions  which  contain  qualitative 
information  such  as  the  sequences  of  positive  and  negative  curvature, 
aside  from  their  magnitude,  will  be  importemt  for  matting  images  with 
sensor-dependent  distortion.  This  is  because  the  sensor  type  wil 
generally  modify  the  magnitudes  of  such  local  curvatures,  but  will  not 
chemge  their  sign  patterns. 

This  curve  representation  method,  called  "beam  thinning",  works  using  a 
hierarchical  description  of  a  curve  in  which  increasing  detail  appears 
with  increasing  depth  of  the  describing  graph  structure.  The  hierarchical 
description  for  extracted  curves  is  the  following: 

i^proximate  the  curve  by  a  "sladD"  of  thickness  d^  as  shown  in  Fig.  3-7. 
For  each  thickness  d^ ,  the  curve  is  easily  partitioned  into  convex  and 
concave  regions.  For  each  such  concave/convex  portion,  there  is  a  node  in 
the  descriptor  graph.  Such  a  node  contains  descriptive  information  for 
that  portion  of  the  curve,  such  as  the  curvature  emd  the  length. 

As  the  thickness  of  the  approximating  slab  decreases,  each  convex/concave 
region  caui  potentially  split  into  concave/convex  subregions.  Hiis  can  be 
expressed  in  graph  theoretic  form  by  forming  subnodes  under  each  of  the 
nodes.  In  this  way,  there  is  a  description  of  the  curve  for  a  sequence  of 
slab  thicknesses  {d„}  by  means  of  a  graph  vAiose  node  levels  correspond  to 
the  {d„ } .  The  ultimate  level  of  descriptive  detail  corresponds  to  a  slab 
thickness  of  1  pixel  (see  Fig.  3-4). 

Such  a  description  allows  a  descending  sequence  of  approximate  curve 
matchings  to  occur  based  on  a  succession  of  increasingly  detailed  trends 
in  the  curve  description.  This  is  one  way  to  deal  with  the  problem  of 
insertions  ^uld  deletions  of  pixels,  as  well  as  small-scale  stretching  and 
dilation  of  a  curve. 

Since  for  each  approximation  level  d„  there  is  a  corresponding  level  in 
the  desriptor  graph,  dynamic  programming  csm  be  used  for  matching 
convex/concave  regions  just  as  it  can  on  the  individual  pixel  level. 

An  issue  is  the  precise  confutation  of  an  approximating  slab  of  thickness 
d  for  a  curve  C.  An  appealing  heuristic  is  the  minimization  of  the  total 
mechanical  strain  energy  in  the  slab: 

K^(s)dp  where:  L  -  length  of  the  curve  C 

ds  =  element  of  arc  length 

k(s)  =  curvature 
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Fiq«  3~4;**Beam  thinninq"decomposition  of  an  image  curve 

However,  the  previous  measure  is  for  curves,  not  beams.  Moreover,  it  is 
scale-variant. 

A  better  measure  of  stress  energy  for  curves  is  the  scale-invariant 
measure  proposed  by  (Bruckstein  and  Netravali,90] J 

C*{'?(s)}  -  Ll^kMs)ds  -  L;^[dY(s)/ds)2ds 

where  Y(s)  ■  the  Y-s  representation  of  the  curve  C,  ie.  the  tangent  angle 
as  a  function  of  arc-length  s. 

Ihis  measure  will  have  to  be  extended  to  a  beam  of  non-zero  thickness  d. 

Another  algorithmic  problem  concerns  the  determination  of  the  beeun  such 
that  it  covers  the  given  curve  subject  to  minimizing  this  sclae-invariauit 
energy  measure. 

Finally,  there  is  the  possibility  of  using  eigenvector  decompostion  for 
further  hashing  of  the  "feature  vector"  corresponding  to  tb-^’  mi-  . 
description  at  any  level  of  the  descriptive  hierarchy  graph. 

The  proposed  method  has  the  advantage  of  providing  a  "coarse-fine" 
representation  that  is  gaining  acceptance  for  image  analysis  in  general. 
However,  the  choice  of  a  robust  computational  method  to  achieve  this  has 
been  a  problem  in  the  past.  Fourier-based  and  other  globally-symmetric, 
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orthononnal  basis  fionction  nethcxis  methods  tend  not  to  be  responsive 
enough  locally  in  a  compact  form. 

The  proposed  method  is  also  stable  enough  to  allow  the  use  of 
convexity/concavity  concepts  v^ich,  in  the  past,  have  been  unstable 
descriptors  for  noisy  curves. 


3.2.3  Combined 

An  ap^aling  approach  is  to  combine  the  area  and  contour-based  methods 
described  in  the'  previous  sections  into  a  procedure  vdiich  exploits  the 
strengths  of  each. 

It  has  been  found  that  the  use  of  correlation  methods  described  in  section 
3.3.1  depends  heavily  on  knowledge  that  the  images  to  be  mapped  are 
rotationally  aliped.  The  remapping  and  correlation  algorithms  are  very 
sensitive  to  residual  rotational  errors  between  the  images. 

Vector-based  methods,  however,  can  be  used  to  indicate  auid  quantify  the 
amount  of  residual  rotation  present  between  the  SAR  and  the  optical 
images.  Thus  the  vector-based  contour  methods  could  be  used  initially  on 
^  EO-SAR  image  pair  to  determine  the  amount  of  rotation  between  the  two 
images.  This  method  could  be  used  at  several  locations  in  any  given 
image.  From  the  rotational  offsets  and  the  euigle  of  the  vectorized 
contour  at  each  location,  an  origin  and  angle  of  rotation  can  be  estimated 
for  the  entire  image  set.  One  of  the  images  can  then  be  rotated  2uid 
resampled  to  match  the  other.  The  correlation-based  matching  metrics  can 
then  be  expected  to  provide  in^jroved  performance. 

3.3  Results 

The  results  of  the  SAR-optical  image  registration  experiments  are 
discussed  in  the  following  subsections.  The  results  of  area-based  methods 
are  described  in  section  3.3.1,  v^ile  contour-based  streak  detection  eund 
matching  methods  appear  in  section  3.3.2. 


3.3.1  Area-Based  Matching  Results 


As  previously  described  in  [Curlander  and  Kober,  89],  a  full  range  of 
tests  comparing  the  Meucimum  Noise  Fraction  Transformation  and  Karhunen- 
Loeve  remapping  methods  with  more  conventional  algorithms  as  normalized 
correlation  and  edge  mapping  for  SAR-optical  registration.  Table  3-1 
shows  the  results  of  these  tests  on  15  image  sites  in  four  Landsat  (Band 
4)-SEASAT  image  pairs.  Each  region  in  the  SAR  image  uses  80x80  pixels  and 
each  region  in  the  IR  images  uses  60x60  pixels.  In  addition  to  the  Yuma 
data  set  shown  in  Fig.  2-1  and  2-2,  similar  data  was  available  fo'  the 
Altamaha  River  GA  region  and  the  Wind  River  Basin  WY  region  {Curlandei.  and 
Kober, 89). 
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Hie  Karhunen-Loeve  (K-L)  method  utilizes  a  nonnalized  correlation  of  the 
K-L  remapped  regions  of  the  SAR  image  with  histogram  remapped  regions  of 
the  optical  image.  The  histrogram  remap  is  a  global  operation,  while  the 
K-L  transformation  is  a  local  operation.  The  normalized  correlation  is 
then  operating  on  two  images,  one  of  vdiich  changes  with  each  candidate 
offset  \diile  the  other  remains  constant. 

It  has  been  found  that  the  correlation  between  the  K-L  remaining  and  the 
histogram  remapping  produces  sharper,  more  distinct  correlation  peeJts  them 
are  produced  normalized  correlation  or  usual  methods  of  edge  matching. 

An  example  of  these  results  is  shown  in  Fig.  3-5  for  a  region  of  the 
Altamedia  River  GA  region,  for  vdiich  there  was  also  collateral  coverage 
from  Lemdsat  TM  and  SEASAT  SAR. 

The  Meucimum  Noise  Fraction  Transformation  (NNF)  method  uses  a  nonnalized 
correlation  of  the  NNF  transformed  SAR  image  and  the  optical  image.  For  a 
given  optical  image  chip,  the  MNF  transform  of  a  corresponding  SAR  image 
chip  is  performed  for  each  trial  offset  in  the  SAR  image. 

As  an  extension  to  this  method,  the  decorrelation  filter  was  used  as  a 
pre-processor  to  enhauice  the  edge  content  of  both  images.  Then  the  MNF 
treunsform  was  applied  to  the  SAR  image  as  described  2dx>ve,  followed  by 
normalized  correlation  between  this  remapped  SAR  chip  and  the  optical 
chip. 

The  results  of  the  MNF  method  proved  promising,  and  the  additional  prior 
use  of  the  decorrelation  filter  increased  the  correlation  coefficient  in 
the  match  surface  in  mo'it  cases  (see  Table  3-2). 

3.3.2  Contour-Based  Matching  Results 

The  intent  in  using  contour-based  matching  was  to  utilize  long,  linear 
features  of  interest,  such  as  roads  or  power  lines.  Such  features  are 
more  likely  to  be  highly  visible  in  the  optical  images,  which  generally 
have  a  higher  signal  to  noise  ratio  and  often  are  higher  resolution. 

Once  these  linear  features  have  been  segemented  from  the  EO  imagery,  they 
are  used  to  cue  a  variety  of  segmentation  and  matching  methods  for  the  SAR 
image.  Also,  since  these  features  form  image  contours,  they  can  be  used 
to  estimate  the  residual  rotation  between  the  SAR  and  optical  images. 
Therefore,  contour-based  matching  can  be  used  as  a  step  for  removing  such 
residual  rotation  prior  to  area-based  matching,  which  are  especially 
sensitive  to  such  rotational  errors. 

The  variedsle  profiling  algorithm  was  tested  on  several  Landsat  images.  It 
was  found  that  due  to  the  relatively  low  noise  content  of  the  Landsat 
data,  it  was  possible  to  use  a  relatively  small  profile  sum  region.  As  a 
result,  profiling  in  the  vertical  direction  only  provided  sufficient 
filtering  to  extract  contours  up  to  45®  with  respect  to  the  horizontal.  A 
complementary  profiling  in  the  horizontal  direction  should  suffice  to 
reliably  extract  all  contours  of  interest. 
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The  width  of  the  region  over  which  the  profile  is  summed  determines  both 
the  level  of  noise  tolereunce  and,  as  well  as  the  angle  of  the  line  that 
can  be  detected.  Wider  profiles  achieve  greater  noise  immunity  but  limit 
the  emgle  of  the  detectable  line.  This  is  the  reason  for  using  the 
profiler  first  on  the  lower  noise  optical  imagery  to  provide  cues  for 
searching  in  the  higher  noise  SAR  imagery.  Once  the  direction  of  a  line 
is  known,  the  profiler  can  be  customized  for  direction  in  the  high  noise 
imagery  so  that  a  wider  profile  can  be  used. 

The  results  of  the  variable  profiler  on  the  Landsat  band-4  image  of  the 
Yuma  scene  is  shown  in  Fig.  3-6.  Ckice  such  an  image  has  been  created, 
unconnected  segments  are  removed  and  adjacent  segments  are  connected 
together  into  streaks  which  are  stored  as  linked  lists.  !Ihese  lists  can 
then  be  accessed  according  to  length  for  use  in  cueing  potential 
corresponding  locations  in  the  SAR  imagery. 

These  lists  could  also  be  used  in  image-map  matching,  assuming  the  map 
data  were  in  a  digital  form  such  as  DMA's  DFAD.  The  results  of  the 
variable  profiling  method  for  streaks  of  length  4  segments  is  shown  in 
Fig.  3-7.  similarly  the  results  for  streak  lengths  longer  tham  8  segments 
is  shown  in  Fig.  3-8. 

Once  potential  streaks  have  been  identified  as  possible  features  to  be 
identified  in  the  SAR  imagery,  a  variety  of  algorithms  can  be  used  to 
segment  them  in  the  SAR  image.  Following  stre^  segementation,  image 
coordinates  are  used  for  resampling. 

Given  a  streak  and  its  extent  in  the  optical  image,  mission  parameters 
define  a  region  of  uncertainty  in  the  SAR  image  within  which  the 
corresponding  feature  must  exist.  In  the  worst  case,  this  region  will 
contain  the  entire  SAR  image.  More  likely,  however,  the  uncertainty 
region  involves  only  a  smaller  subset  of  the  SAR  image. 

Within  this  uncertainty  region,  a  "total  profile"  is  created.  The  total 
profile  consists  of  a  surface  generated  by  multiple  summed  profiles  at 
successive  trial  offsets  within  this  region.  At  each  trial  offset,  a  set 
of  profiles  corresponding  to  the  shape  of  the  streak  extracted  from  the 
optical  image  is  summed,  as  described  in  section  3. 2. 2.1.  Fig.  3-9  shows 
two  total  profiles  generated  from  the  SEASAT  SAR  of  the  YUma  site,  and  the 
corresponding  profiles  superinfx^sed  with  the  original  SAR  images. 

Each  offset  in  the  total  profile  corresponds  to  pixel  in  the  SAR  image. 
Analysis  of  these  profiles  euid  conparison  with  the  correspondinng  SAR  and 
optical  image  pair  shows  clearly  that  the  total  profile  method  produces  a 
surface  containing  a  peak  or  ridge  along  the  exact  location  of  the  target 
streak  in  the  SAR  image.  This  peak  is  especially  pronounced  fc  the 
river  and  road  features  in  the  Yuma  site. 
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Normalized  Correlation  Test  if  1 


Histogram  Remap  Test  if  1 


Edge  Matching  Test  #7  K-L  Remap  Test  #  7 


Fig.  3-5:  Correlation  surfaces  derived  from  normalized  correlation  of 
indicated  remapped  versions  of  the  Landsat  Band  4  and  SEASAT 
image  pairs  of  Altamaha  River  GA  site. 
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Table  3  -2  Numerical  results  of  contour-based  registration. 


Feature 

Location  in 
Optical  Imagery 

Expected 

SAR  Offset 

Best 

SAR  Offset 

Error  (pixels) 

River 

199,  192 

200.  192 

204,192 

4 

Road  1 

7,  300 

7,302 

7,301 

1 

Road  2 

261, 430 

266,427 

266,425 

2 

Railroad  (diag) 

188, 426 

188.423 

189,423 

1 

Table  3-3  NumeiicaJ  results  for  combined  contour-area-based  registration. 


Test 

Location  in 
Optical  Imagery 

Expected 

SAR  Offset 

Best 

SAR  Offset 

Correlation 

Coefficent 

#5  K-L 

370. 430 

358, 421 

358,422 

.8775 

#5  MNF 

370, 430 

358, 421 

370,420 

.9308 

#6  K-L 

285, 338 

273,  229 

293, 231 

.7431 

#6  MNF 

285,  338 

273,  229 

271,223 

.9758 
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Ttie  streak  first  extracted  from  the  optical  image  then  acts  as  a  mask  for 
the  a  sum  of  absolute  differences  correlation  (see  section  3.1.2).  Since 
the  total  profile  surface  has  its  maximum  peaks  along  a  shape  which 
mirrors  the  feature's  imaged  shape,  the  correlation  metric  should  yield 
the  correct  feature  offset  within  the  SAR  image.  The  resulting 
correlation  surfaces  using  the  sum  of  absolute  differences  metric  for 
comparing  the  total  profile  and  the  optical  streak  mask  are  shown  in  Fig. 
3-6,  with  numerical  resialts  tabulated  in  Table  3-2. 

As  can  be  seen  in  Fig.  3-10,  the  correlation  method  produces  distinct 
peaks  at  the  correct  offset  location  of  the  start  of  the  optical  profile 
in  the  SAR  image.  The  surface  given  in  Fig.  3-7a  is  slightly  misleading 
in  that  the  peak  in  the  foregrovind  is  actually  musch  larger  than  the  peak 
in  the  background,  but  are  similar  looking  because  of  the  perspective 
distortion  of  the  surface.  The  road  feature  shown  in  Fig.  3-9  auid  3-10 
referes  to  RCAD  2  in  TaOsle  3-2. 

As  an  enh2mcement,  the  profiling  in  the  SAR  image  can  be  enlarged  in  the 
direction  along  the  candidate  streak  so  that  the  profile  is  more  robust  to 
speckle  but  still  sensitive  to  the  required  shape. 

Several  other  algorithms  for  segmenting  cued  features  from  SAR  imagery  are 
documented  in  [Adair  et  al,90].  The  methods  discussed  include  Ramdomness 
Center  of  Mass,  Kolmogorov-Smirnov  Test,  difference  Edge  Detector,  and 
Mean-Squared  to  Variauice  Ratio  test.  These  methods  will  be  investigated 
in  Phase  ll. 

The  correspondences  found  by  such  methods  produce  a  set  of  "tie  points" 
which  are  then  interpolated  to  form  a  resampling  grid  (see  section  4.2). 

3.3.3  Combined  Area-Contour  Matching  Results 

It  has  been  found  that  successful  use  of  the  correlation  methods  described 
in  section  3.3.1  strongly  depends  on  eui  initial  rotational  alignment  of 
the  image  pair.  The  remapping  and  correlation  algorithms  are  quite 
sensitive  to  residioal  rotational  errors. 

Contour-based  methods,  however,  can  be  used  effectively  to  estimate  this 
residual  rotatiort  between  corresponding  vectorized  contours  in  the  two 
images  at  various  locations.  Following  a  global  rotation,  one  of  the 
images  is  then  resampled  to  match  the  other.  This  process  is  then 
followed  by  an  area-based  matching  process. 

The  streak  algorithm  was  applied  on  the  SAR  and  TM  imagery  of  the  Yuma 
site,  and  a  residual  rotation  between  them  of  1.73°  was  estimated.  This 
rotational  error  was  corrected  using  a  simple  bilinear  interpolati''n  on 
the  TM  image. 

Tests  #5  and  #6  (see  Table  3-1)  were  performed  again  using  the  K-L  and  MNF 
transformations.  A  comparison  of  Tables  3-1  and  3-2  shows  a  definite 
improvement  in  the  correlation  coefficient  of  the  results  for  the  MNF 
method.  However  little  inprovement  can  be  noted  in  the  actual  offset 
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determined  by  the  methods,  except  in  the  case  of  the  K-L  which  exhibits 
marked  improvement  from  an  error  of  12  pixels  to  aui  error  of  1  pixel  with 
high  correlation  of  about  .88. 

3.4  Remaining  Problems 

Considerably  more  testing  and  evaluation  of  test  results  over  a  wider 
variety  of  sensing  scenarios  must  be  perfomed  to  better  understeuid 
qualitatively  the  circumstances  lander  vrtiich  the  area-based  and  contour- 
based  registration  algorithms  are  successful  or  fail.  Once  such  a 
qualitative  understeinding  is  achieved,  a  quantitative  model  for  the 
registration  process  can  be  attenpted. 

A  goal  for  Phase  II  should  be  to  develop  a  performance  prediction  model 
for  the  success  of  registering  an  image  pair  by  a  given  method,  given 
quantitative  sensor  parameters  as  inputs. 
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4. 


Sub-Pixel  ReqistraticMi  Estimaticyi 

In  this  section  is  a  discussion  of  two  sub-pixel  translation  estimation 
algorithms.  Such  local  estimates  are  then  interpolated  in  order  to  obtain 
a  global  deformation  grid  valid  for  the  entire  image.  To  support  pixel- 
level  change  detection  then  requires  the  application  of  a  resampling 
operation. 

After  inital  matching  to  the  nearest  pixel  has  been  accomplished,  often 
what  is  done  is  to  apply  some  type  of  interpolation  fxinction  of  the 
matching  function  surface  in  the  neighborhood  of  the  correct  offset.  This 
serves  to  locate  the  position  of  the  extreme  point  of  the  match  surface  to 
sub-pixel  precision.  The  choice  of  appropriate  interpolation  function  is 
an  issue. 

How3ver,  examing  this  problem  from  a  signal  processing  point  of  view,  the 
phase  function  of  a  frequency  component  in  the  Fourier  Traunsform  of  an 
image  is  much  more  sensitive  to  misalignments  than  the  corresponding 
magnitude  function.  This  observation  is  the  basis  for  most  of  the  recent 
work  in  sub-pixel  registration. 

4.1  Estimation  of  Local  Translation 

The  preferential  use  of  the  Fourier  phase  is  the  basis  of  the  two 
algorithms  below.  Algorithm  #1  exploits  the  Fourier  Shift  Theorem  for 
sub-pixel  offset  values  by  differencing  phases  for  corresponding 
frequencies.  Algorithm  #2  uses  a  sophisticated  interpolation  of  the  phase 
correlation  surface. 

4.1.1  Algorithm  #1 

The  following  is  a  1-D  version  of  an  algorithm  for  sub-pixel  registration 
estimation.  It  estimates  rational  sub-pixel  offsets  using  analytic 
relationships  among  non-integer  shifts  involving  the  Discrete  Fourier 
Treuisform  (DFT). 

One  would  like  to  exploit  the  linear  phase  relationship  for  arbitrary 
translations  that  exists  for  the  continuous  function  version  of  the 
Fourier  Shift  Theorem  {Papouiis,77] . 

f(t-a)  <-»  e-:'*"F(w) 

Miere  f,F  indicate  the  corresponding  time-domain  and  Fourier  domain 
functions . 

Briefly,  the  1-D  version  is  as  follows: 

o  Apply  the  FFT  to  each  row  separately  in  both  windows.  For  example, 
the  i^**  row  in  the  window  of  images  1  and  2  would  be: 

(a^  —  FFT } 

{ bj  ,  .  .  . ,  bjj  )  I  ^  j  =  FFT{  ( Xj  ,  .  . .  ,Xjj  )  j  £  {  } 
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Do  for  all  rows  i,  i  -1,...,M,  in  both  images. 

o  For  each  pair  {aj,bj}  in  each  row  i,  with  k  corresponding  to  the 
frequency  index,  v^ere 

aj  e(aj , . . .  ,aj, )  ( i  j ,  e(bj^ , . . .  ,bf, ) ,  ^  j 

form: 

I(a)J-  imag  {aj},  l(b)J-  imag  {t^ } 

R(a)J-  real  {aj},  R(b)J-  real  {t^} 

<(>(a)J-  Tan-Ml(a)iL/R(a)j^ ) 

♦(b)i-  Tan-Ml(b)J/R(b)J ) 
di-  [♦(a)i-  ♦(b)i]/(2Tik/N) 

dl-  (1/^) 

o  Cluster  the  {d! } .  The  values  of  the  dominant  cluster  represent  the 
horizontal  shift  value.  These  clustered  values  should  be  averaged  or 
used  in  some  estimation  procedure. 

Only  the  1-D  version  of  this  algorithm  has  been  tested.  The  extension  to 
2-D  is  expected  to  occur  in  Phase  II.  An  example  of  the  performance  of 
this  procedure  as  a  function  of  frequency  is  shown  in  Fig.  4-1. 


Fig.  4-1: Behavior  of  Phase  for  Sub-Pixel  Offsets 
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Some  additional  theoretical  considerations  for  the  above  algorithm  follow 
below.  In  particular,  the  effects  of  trying  to  estimate  a  sub-pixel 
offset  for  two  sairpled  images  without  prior  low-pass  filtering  are 
discussed. 

Suppose  there  are  two  1-D  time  sequences,  g(k)  and  h(k).  Suppose  further 
that  h(k)  represents  a  sub-pixel  shifted  version  of  of  g(k).  The 
following  represents  a  model  which  is  appropriate  for  estimating  this  sub¬ 
pixel  offset. 

Let  f(t)  be  a  continuous  time,  finite  energy  signal  and  let  F(«)  be  the 
corresponding  continuous  Fourier  transform.  Then  let  g(k)  be  a  sampled 
version  of  f(t). 

Therefore : 

g(k)  -  f(kto )  where  tg  is  the  sanpling  frequency. 
h(k)  -  f(kto-a) 

Since  g(k)  cind  h(k)  are  both  discrete  time,  finite  energy  sequences,  the 
appropriate  version  of  the  Fourier  transform  is  the  Discrete  Time  Four:.-»r 
Transform  (DTFT).  We  denote  these  correspondences  between  domains  as: 

g(k)  ♦-(DTFT)->  G(exp(  jwtg  ]) 

h(k)  ♦-(DTFT)-*  H(exp[  jwtg  ] ) 

The  DTFT  and  its  inverse  are  described  in  (Roberts  and  Mullis,87],  ie: 
g(k)  -  (to/2ii)X"°G(exp(  jwtg  ]  )exp(  jwkto  ]dw 
G(exp[jwtgl)  -  i:;;._.g(k)exp(-jkwto  ) 
where  2n/tg 

The  characterization  of  G(exp(jcotoI)  and  H{exp(jwtg])  in  terms  of  F(jw), 
the  continuous  time  Fourier  transform  of  the  continuous  function  f(t)  is 
now  required. 

It  is  shown  in  [Roberts  and  Mullis,87]  that: 

G{exp[jwto])  *•  J^._.F(  joH-jno^  ) 

Therefore  G(exp[  jcotg  ] )  is  represented  as  an  infinite  sum  of  shifted 
versions  of  F(ja)).  If  Fijw)  is  band-limited  to  w  =  then: 

F(j(*M-n(^,)  -  0  for  all  n^O  (4-1) 

This  implies: 

G(exp[  icotg  ) )  =  F(  jw) 
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If  the  condition  in  equ  (4-1)  is  not  satisfied,  then  G{exptjwtQ])  will  be 
an  aliased  version  of  F( jw) . 

In  order  to  describe  H(exp[  jcotg  ] ) ,  we  let: 

h(k)  -  tof(kto-a) 

Using  the  inverse  DTFT  on  h(k)  and  the  inverse  Fourier  transform  on  f(t), 
we  obtain: 

(t(,/2ii) /“O  H(exp[  jwtg  ]  )exp[  jwkto  ]  dw 

-  (to/2ii)n.F( jw)(exp( jw(kto-a)l)  dw  (4-2) 

But  the  right  hand  side  of  (4-2)  can  be  expressed  as: 

(to/2ii)X!,F( jo))(exp[ j«(kto-a)])  d« 

-  (to/2a)  J^,F(  ja))(exp(  jwkto  ]  )exp[-jwa]  da>  (4-3) 

Breaking  the  infinite  limits  of  the  integral  into  a  sum  of  finite 
segments,  (4-3)  can  be  expressed  as: 

(to/2ii)  J!.F(  jw)(exp(  jcokto  ]  )exp(-j<oal  dw 

-  (to/2ii)I^._.  Ji2J^'“°F(jw)exp(jajkto  )expt-jwa]  dco  (4-4) 

Let  ♦  ■  o*-nM(j ,  then  dt  ■  dca.  This  change  of  variables  converts  (4-4)  to: 
(to/2n)  J:.F(  jco)(exp[  jwkto  ]  )exp[-jwa]  dw 

-  (to/2ii)I^.,.  J“°exp(-j(<tH-n<«^  )al  F(  j-H-jnu^  )exp(  jk<|»to ) 

exp[  jknc^tg  ]  d<<t  (4-5) 

But  the  exponential  factor  exp(  jkno^  tg  J  -  1  since  o^jtg  -  2ii.  Therefore, 
the  previous  integral  expression  is  given  by: 

(to/2n)j:;;,_.;"°exp[-j(<|H-na^  )a]  F(  jHjnub  )exp[  jk+tg  ]  d+ 

Therefore,  using  absolute  convergence  one  can  interchange  the  svim  and 
integral  iFulks,66]: 

(tg/2Jt)/^°  H(exp[  jwtg  ]  )expl  jwktg  ]  dw 

-  (to/2n)E;;._.;^°exp(-j(«|>+n<»^  )a]  F(  j^+jna:^,  )exp[  jkftg  ]  d(|) 

-  (tg/^n)/^®  2:;;._,exp[-j(4H-nw|,  )a]  F(  j4>+jna:^  )exp[  jk^-tg  ]  d<f>  (4-6) 

Conparing  the  right-hand  side  of  (4-6)  with  the  left-hand  side  of  (4-2) 
implies: 

H(exp[jwto])  =  Z^,_.F(  jwf jno^j  )  exp[-j(wfn(*:^  )a]  (4-7) 
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Now,  if  a  -  0,  then: 

H(exp[juto])  -  G{exp[  jwto  ] )  -  j<»x-jna^  )  (4-8) 

Equations  (4-7)  and  (4-8)  will  now  be  used  to  estimate  the  sub-pixel  shift 
a.  Suppose  F(jc»))  is  beind-limited  to  %/2,  then: 

F(jw)  -  0  for  |wl>a:^,/2,  ie.  F(jaH-jn<»^)  ■  0  for  n^O. 

This  inplies  that: 

G(exp[jwto])  -  F(ju) 

H(exp[j«to])  -  F(jw)  exp[-jaa] 

Using  a  discrete  Fourier  transform,  one  can  compute  samples  of 
G(  exp(  jwto  ] )  and  H(  exp[  jwtg  ] ) . 

Taking  a  ratio  of  the  samples  of  G(exp( jwtg ] )  euid  H(exp[ jwtg ] ) ,  one  can 
compute  exp(jooal  vrtiich  gives  the  sub-pixel  shift  a. 

If  F(jw)  is  not  band-limited,  then  F(  jaM-jncc^  )  is  non-zero  for  non-zero 
values  of  n.  This  results  in  eui  aliased  spectrum  in  both  cases.  As  a 
result  of  the  aliasing,  one  cannot  easily  conpute  the  sub-pixel  offset. 

If  f(t)  is  not  a  band-limited  signal,  it  can  be  frequency  filtered  to  make 
it  band-limited.  It  is  critical  that  the  filtering  be  performed  prior  to 
sanplinq.  Otherwise,  aliasing  will  again  result. 

4.1.2  Algorithm  #2 

Another  local  translation  estimator  which  was  investigated  in  Phase  I  is 
the  phtse  correlation  algorithm  [Schaxan  and  McHugh, 88],  [Stocker  and 
Claytor,89],  ( Stocker, 90] ] .  Although  phase  correlation  is  generally  known 
to  be  aa  effective  method  for  measuring  the  translational  misregistration 
between  a  pair  of  optical  image  sub-frames,  there  are  several  "fine 
points’  involved  in  applying  the  technique  to  discrete  data.  In  the 
following,  we  review  the  basic  phase  correlation  approach  and  its 
underl"  ing  model,  discuss  several  important  details  of  its  implementation, 
show  srme  test  results  on  synthetically-shifted  data,  and  apply  the  method 
to  measure  residual  misregistrations  in  the  Death  Valley  scenes. 

Basic  Model.  Consider  a  continuous  optical  background  scene  s(x,y)  that 
is  projected  onto  the  focal  plane  on  Freime  1.  At  Frame  2,  the  same  scene 
appears  displaced  by  an  arbitrary  2-D  spatial  offset  (a, 0)  due  to  apparent 
backgro^ind  translation.  The  images  corresponding  to  these  two  frame;'  are 
defined  as 

<t>i  (x,y)  -  s(x,y)  (4-9a) 

<f'2(x,y)  -  s(x-a,y-0)  (4-9b) 
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Let  S(u,v)  be  the  2-D  Fourier  transform  of  the  scene  s(x,y),  v^ere  (u,v) 
are  spatial  frequencies  corresponding  to  the  focal  plane  position 
variables  (x,y).  The  2-D  Fourier  transforms  of  the  two  frames  are 

<J>^(u,v)  -  S(u,v)  (4-lOa) 

♦2(u,v)  -  S(u,v)e-:i2>'(«u+Pv)  (4_iob) 

The  2-D  shift  manifests  itself  as  an  additive  linear  phase  in  each 
dimension  of  the  spatial  frequency  domain. 

The  optical  images  (x,y)  and  ♦,  (x,y)  are  san^)led  on  the  focal  plane  to 
generate  the  discrete  sensor  frames  f^(k,l)  and  f3(k,l).  We  assume 
without  loss  of  generality  that  the  scene  transform  C(u,v)  is  bandlimited 
by  the  sensor  optics  and  detectors  to  the  spatial  frequency  interval  [- 
0.5,  +0.5]  in  both  the  u  and  v  dimensions.  Then  the  san5)ling  interval  can 
conveniently  be  defined  as  unity  in  each  spatial  coordinate.  If  each 
frame  is  space-limited  to  N  pixels  in  the  k  emd  focal  pl2urie  coordinates 
and  viewed  as  a  periodic  function,  then  it  can  be  shown  that  the  discrete 
Fourier  transforms  defined  by 

Fi(m,n)-  (k,l)exp{-j2n(km+ln)/N}  (4-11) 

i-1,2 


provide  N  samples  of  the  respective  frame  Fourier  transforms  ♦i(u,v)  on 
the  spatial  frequency  repetition  interval  from  [-0.5,  +0.5]  in  both  u  and 
V.  These  san^jles  occur  at  evenly-spaced  frequency  increments  of  1/N  in 
either  dimension,  so  that 

Fi(m,n)-  ♦i(nv^,n/N)  (4-12) 


-N/2  ^^/2-l ,  -N/2  ^^/2-l 

and  from  4-(10)  we  can  write 

Fj (m,n)  -  S(m/N,n/N)  (4-13a) 

Fj(m,n)  -  S(nvT4,n/N)e-i2M<.«+en)/M  (4_i3b) 

To  simplify  the  notation,  we  will  use  S(m,n)  to  represent  S(nv/N,n/N)  in 
the  subsequent  discussion. 

Phase  Qjr relation.  The  phase  correlator  is  a  whitened  cross-correlation 
estimate  defined  by 

r(Ax,Ay)-  E„E„w(m,n)  [F‘ (m,n)F2  (m,n)  ]/[  |F‘ (m,n)F2  (m,n)  |  ]  x 
exp{j2Tt(mAx  +  nAy)/N} 

as  a  function  of  continuous  2-D  lag  variables  (Ax, Ay).  The  spatial 
frequency  weight  w(m,n)  is  included  here  for  generality;  its  role  is 
discussed  later. 
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For  the  pair  of  perfectly  translated  frames  F,  and  F,  defined  in  (4-13), 
the  above  phase  correlation  function  evidently  peaks  at  the  true  2-D 
offset  (Ax, Ay)  -  (a, 0).  The  translation  laetween  the  two  frames  can 
therefore  be  determined  by  precisely  measuring  the  location  of  the  peak  of 
(4-14). 

Phase  correlation  is  best  viewed  as  a  weighted  cross-correlation  of  two 
"vdiitened"  image  frames  whose  spatial  frequency  components  are  defined  by 
Fj  (m,n)/|Fi  (m,n)  I  and  F^  (ra,n)/|Fj  (m,n)  | ,  respectively.  A  ^itened  frame 
retains  only  the  phase  information  in  the  original  spatial  frequency 
components,  eind  is  often  referred  to  as  a  phase-only  image.  Spectral 
whitening  is  a  nonlinear  filtering  process  which  tends  to  enhance  the 
high-frequency  scene  features  (such  as  edges)  that  contain  most  of  the 
useful  information  for  registration. 

Since  meiny  natural  background  scenes  from  electro-optical  sensors  tend  to 
have  a  substantial  amount  of  power  concentrated  at  lower  spatial 
frequencies,  a  conventional  (unwhited)  cross-correlation  often  produces  a 
relatively  broad  output  peak.  The  spectral  whitening  used  by  the  phase 
correlation  procedure  tends  to  sharpen  the  peak,  leading  to  more  precise 
measurements  of  frame  displacement. 

Discrete  Tmplenentation.  The  phase  correlator  in  (4-14)  is  theoretically 
defined  on  ^e  continuous  2-D  spatial  variables  (Ax, Ay).  However,  in  a 
digital  computer,  the  correlation  is  actually  computed  at  discrete 
intervals.  If  the  desired  lag  interval  in  each  dimension  is  selected  as 
one  pixel,  then  the  phase  correlation  can  be  computed  by  the  2-D  inverse 
DFT 

r(p,q)- 

l/[  |Xi  (m,n)Xj  (m,n)  I  ]  x  exp{j2ii(pm+qn)/N}  (4-15) 

for  integer-valued  pixel  lags  (p,q)  in  the  reunge  -N/2, . . . ,N/2-l.  Note 
that  the  DFT  summations  are  performed  symmetrically  about  the  dc  spatial 
frequency  conponent  located  at  (ra,n)  -  (0,0)  to  ensure  a  real-valued 
result. 

The  phase  correlation  saunples  in  (4-15)  can  be  found  by  brute-force 
calculation  or  by  taking  eui  NxN  inverse  FFT  of  the  weighted  cross-spectrum 
phase  function.  The  FFT  approach  is  most  efficient  unless  the  required 
number  of  lags  is  know  to  be  fairly  small.  It  provides  circular  phase 
correlation  samples  for  pixel  lags  ranging  from  -N/2  to  N/2-1  in  either 
spatial  dimension.  A  non-circular  implementation  can  be  obtained  by 
zerofilling  both  frames  prior  to  the  forward  FFTs,  but  this  requires  a 
larger  FFT  size.  A  less-expensive  way  of  avoiding  problems  dno  to 
circular  correlation  and  the  non-periodicity  of  actual  frame  data  is  to 
apply  a  tapered  weighting  function  (such  as  a  2-D  Hanning  window)  to  the 
frames  prior  to  the  forward  FFT  in  (4-11)  that  transforms  them  to  the 
frequency  domain. 

The  frequency-dependent  weighting  function  w(m,n)  in  (4-15)  is  used  to 
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emphasize  those  regions  of  the  cross-spectrum  phase  function  that  contain 
useful  phase  information.  This  function  is  generally  chosen  to  have  a 
low-pass  characteristic  (e.g.,  1  2-D  Hanning  or  Gaussian  window) .  The 
reason  is  that  in  most  sensor  data,  the  highest  spatial  frequencies  tend 
to  be  heavily  corrupted  noise  and  aliased  conponents;  thus  attenuating 
these  components  with  frequency-selective  weighting  improves  the 
misregistration  estimates. 

The  location  of  the  peak  in  the  phase  correlation  (4-15)  is  a  measure  of 
the  relative  displacement  or  misregistration  between  frames  1  and  2.  To 
estimate  the  displacement  to  sub-pixel  accuracy,  the  peak  position  must  be 
accurately  interpolated  from  the  finite  set  of  samples.  The  classical 
solution  to  this  problem  involves  the  application  of  the  appropriate  2-D 
"sine"  interpolation  function  on  the  discrete  cross-correlation  estimate. 
A  far  more  practical  approach  is  to  fit  a  polynomial  to  the  cross¬ 
correlation  samples  in  the  vicinity  of  the  highest  peak.  A  separable 
quadratic  polynomial  is  the  sinplest  choice  since  it  uses  only  three 
points  in  either  dimension:  the  peak  sample  and  its  neighbors  to  either 
side.  Although  the  translation  estimates  provided  by  this  simplified 
interpolator  are  biased,  it  can  be  shown  that  the  bias  is  a  function  of 
known  processing  parameters  [Stocker  and  Clayton, 89],  [ Stocker , 90 ] .  Thus, 
it  is  possible  to  correct  the  measurement  biases  on  the  fly  using  pre¬ 
computed  lookup  tables. 

A  complete  block  diagram  of  the  phase  correlation  based  registration 
algorithm,  including  the  bias-corrected  peak  measurement  step,  is  shown  in 
Figure  4-2,.  This  algorithm  has  been  successfully  calibrated  on 
synthetically-shifted  data  as  noted  below. 
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Fig.  4-2; 


Block  diagram  of  Registration  Algorithm  Based  on  the  Phase 
correlation  Method 
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Fig.  4-3; 


(b) 

Correlation  Block  Translation  Measurements  for  E)eath  Valley 
Data,  (a)  horizontal  dimension,  (b)  vertical  direction. 
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Weighted  Difference  Images  for  Death  Valley  (1983  and  1988 
Scenes)  -  Band  1,  (a)  After  Global  Registration,  (b)  After  Fine 
Registration. 


4.2  Global  Interpolation  of  Local  Estimates 


Following  the  computation  of  offsets  for  smaller  sized  blocks  of  pixels, 
an  interpolation  process  will  be  required  to  attain  "subpixel"  accuracy. 

One  problem  with  global  least-squares  matching  for  interpolation  of 
registered  control  points  is  that  errors  are  averaged  over  the  entire 
image.  In  fact,  vrfiat  is  realy  required  is  to  find  cin  interpolatin  scheme 
that  is  only  influenced  locally  by  the  data,  since  the  distortions  due  to 
terrain,  atmospheric  effects,  aind  sensor  nonlinearities  act  are  spatially 
varying  eind  so  should  be  compensated  in  a  locally  adaptive  meuiner. 

Another  approach  is  to  use  splines,  vrtiich  restricts  the  effects  to  be  more 
local.  Consider  the  surface  spline  representing  ein  infinite  plate  under 
the  imposition  of  point  loads  (Goshtasby,88] : 

Let:  f(x,y)  -  a^  +  a^x  +  ajy  +  r^^lnr^^ 

where: 

n  -  #  loads 

r^2  -  (x-x^)2  +  (y-yj^ 

(x^  ,yj^  )  -  position  of  i^**  control  pt. 

f(x,y)  -  elevation  of  surface  at  (x,y) 

Now,  substituting  control  pts.  into  equation  for  f  euid  solving  the 
following  set  of  equations  leads  to  the  parameters:  ,  a^ ,  aj ,  F^ , 

1-1, . . .n. 

0 

f(Xi  ,yi )  -  ao+  aiXi+  a2yi+  Fi  ln(r|i  ) 


fK'Yn)  =  ^0+  aiXn+  ^  =  ) 

This  method  will  be  investigated  in  Phase  II  for  interpolating 
registration  control  points. 
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Translation  Measurements 
(Control  Points) 


Surface  Fit  to  Measurements 
(Thin— Plate  Interpolating  Spline) 


Fig.  4-5:  Surface  Fit  to  a  Sparse  Set  of  Translation  Measurements 
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A  related  global  interpolation  method  was  implemented  and  tested  during 
Phase  I.  This  method  uses  the  thin^plate  oobic  spline  [Lancaster  and 
Salkauskas,86] .  Physically,  the  thin  plate  is  analogous  to  a  mechanical 
surface  spline  that  is  not  subjected  to  torques  at  the  dat  points  and 
deforms  only  by  bending.  The  thin  plat  spline  is  mathematically 
formulated  as  the  2-d  equivalent  of  the  well-known  natural  cubic  spline, 
which  is  optimal  in  the  sense  miminizing  total  stress  energy. 

The  thin  plate  is  formally  derived  from  a  Boolean  sum  of  two  different 
projection  operators: 

a)  An  interpolating  projection  formed  from  translates  of  the  basis 
function  r^ln(r^)  as  defined  above, 

b)  A  weighted  least-squares  projection  with  a  weight  matrix  equal  to 
the  inverse  of  the  Vandermonde  matrix  of  the  translates  of  the 
same  basis  function. 

It  should  be  noted  that  the  computation  of  this  spline  can  be 
computationally  relatively  expensive,  since  it  involves  the  inversion  of  a 
matrix  of  dimension  N,  vrtiere  N  is  the  r.rmebr  of  data  points  to  be 
interpolated. 

An  example  of  thin-plate  spline  interpolation  applied  to  the  generation  of 
a  global  deformation  surface  is  shown  in  Fig.  4-5.  A  set  of  2-D 
translation  measurements  using  control  points  taken  at  a  discrete  set  of 
locations  within  the  image  frame  is  created.  Then  a  thin-plate 
interpolating  spline  is  computed  for  the  2-D  apparent  translation  vs. 
pixel  position.  These  surfaces  are  then  used  for  image  resampling. 

This  technique  was  successfully  applied  to  to  perform  global  registration 
for  the  Death  Valley  IR  imagery  discussed  in  section  2.4. 

4.3  Image  Resampling 

Accurate  resampling  of  the  discrete  image  frames  is  critical  to  the 
performance  of  change  detection  schemes  based  on  weighted  frame 
differencing.  E^'en  if  the  global  registration  error  measurements  are 
perfect,  a  poorly  implemented  resanpling  step  can  introduce  significant 
noise  into  a  difference  image. 

The  use  of  sliding-window  FIR  interpolators  is  a  viable  approach  to  the 
image  resanpling  problem,  provided  that  a  suitable  interpolation  kernel  is 
selected  |Wolberg,90] .  In  this  approach,  a  discrete  image  is  resampled  by 
means  of  a  2-D  convolution  operation.  Assume  that  global  registration 
calls  for  the  image  frame  I(m,n)  to  be  shifted  by  a  variable  amount 
as  a  function  of  the  pixel  location  (m,n).  Let 

<^n'^y,nn)  ="  +  Kn  '  ^ 

where  (p,q)  and  (a,8)  denote  the  interger  and  fractional  parts  of  the 
desired  pixel  shifts,  respectively.  The  resampling  operation  produces  a 


79 


new  discrete  frame  defined  by 

Ir  'n+q„n  )“  X ( itH-k , n+1 ) h( k-o^ „  ) 

where  h(x,y)  is  the  2-D  interpolation  kernel.  In  the  present  study,  only 
separable  kernels  of  the  form  h(x,y)  -  h(x)h(y)  with  a  finite  support  of  N 
pixel  intervals  in  either  dimension  are  considered. 

The  performance  of  various  kernels  can  be  interpreted  euid  conpared  by 
considering  the  classical  digital  signal  processing  approach  to 
interpolation  [Lucke,undated] .  In  this  approach,  we  convolve  the  discrete 
frame  data  with  a  continuous  (or,  in  prc.ctice,  a  highly  oversampled) 
reconstruction  kernel  of  finite  extent;  then  decimate  the  resulting  image 
to  obtain  interpolated  sanples  on  the  desired  frame  grid.  Although  these 
steps  are  often  combined  into  one  operation  (as  in  the  above  equation),  it 
is  useful  to  think  of  them  as  two  distinct  conceptual  stages. 

The  optimum  interpolation  kernel  for  sampled  bandlimited  data  is  the 
"sine"  function  sin  (nn)/(im)  with  nulls  spaced  at  the  discrete  pixel 
intervals  indexed  n  -  +1,  -^2,  ...,  +*.  The  frequency  response  of  this 
kernel  is  constauit  over  the  normalized  spatial  frequency  interval  (-0.5, 
+0.5)  and  zero  outside  of  this  interval.  It  achieves  perfect  signal 
reconstruction  since  it  produces  no  in-beind  spectral  distortion  and 
completely  removes  the  aliases  of  the  sampled  signal  which  repeat 
indefinitely  at  the  pixel  rate.  Unfortunately,  ideal  interpolation  can 
only  be  approximated  on  a  finite-length  data  record  and,  in  any  case,  is 
too  computationally  demanding  to  consider  for  most  applications.  Thus  in 
practice,  relatively  small  kernels  with  frequency  responses  that 
approximate  the  ideal  "rect"  function  are  utilized. 

The  desired  characteristics  of  an  interpolation  kernel  are  linear  phase,  a 
nearly  flat  in~beind  response  and  very  low  out-of-band  sidelobes.  A 
reasonadjly  flat  linear-phase  passband  reduces  the  signal  distortion  due  to 
interpolation  filtering.  Low  sidelobes  minimize  the  amount  of 
interpolation  "noise"  introduced  by  aliased  signal  components  that  fold 
into  the  original  band  when  the  filtered  signals  are  decimated  (or 
resanpled)  at  or  near  the  original  sampling  rate. 

Ideally,  it  would  be  desirable  to  apply  the  same  in-band  filtering  to  each 
image  frame  to  be  differenced  or  otherwise  combined  in  a  change  detection 
operation.  This  condition  can  be  satisfied  exactly  if  each  frame  is 
resanpled  (i.e.,  filtered  eind  decimated)  in  precisely  the  same  way,  but 
this  generally  cannot  be  done  in  non-trival  registration  applications. 
The  problem  is  that  the  different  resamplings  needed  for  frame 
registra^'ion  ^pply  different  phase  shifts  to  the  out-of-band  components  of 
the  inteipolated  scene.  These  components,  which  fold  back  into  the 
original  band  upon  decimation,  constitute  a  source  of  "noise"  which  v  >rios 
in  a  rather  unpredictable  way  from  one  image  frame  to  and  hei . 
Significant  clutter  leakage  can  result  due  to  mismatches  in  the  spatial 
frequeiicy  content  of  the  interpolated  frames  being  differenced,  even  in 
cases  where  the  resampling  itself  is  geometrically  perfect. 

A  practical  solution  to  this  problem  is  to  choose  an  interpolation  kernel 
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that  attenuates  the  out-of-band  signal  components  to  the  point  where  they 
are  well  below  the  level  of  the  sensor  noise.  For  a  typical  electro- 
optical  sensor  that  provides  a  background  clutter-to-noise  ratio  on  the 
order  of  (say)  10  (or  20  dB),  this  implies  the  use  of  FIR  filter  kernels 
with  peak  sidelobes  on  the  order  of  25-30  dB  or  better. 

To  gain  some  insight  into  the  performance  of  various  FIR  interpolators,  we 
evaluated  frequency  responses  and  some  sinple  figures-of-merit  for  several 
filter  kernels.  Each  kernel  was  oversampled  by  a  factor  of  10  in  a 
symmetric  fashion  to  ensure  that  its  frequency  response  would  have  linear 
phase.  The  squared-magnitude  of  the  frequency  response  was  then 
calculated  from  the  DFT  formula.  Two  figures  of  merit  were  also  confuted 
for  each  kernel,  based  on  cin  assumed  work-case  ac  background  "signal" 
having  a  trieuigular-shaped  amplitude  spectrum  extending  out  to  the  band 
edges  at  the  normalized  spatial  frequencies  (-0.5,  +0.5).  These  figures- 
of-merit  are  defined  as  follows: 

1)  Signal  Loss.  The  loss  in  signal  power  in-bauid  due  to  the  filtering 
applied  by  the  interpolation  kernel  (a  measure  of  the  amount  of 
passband  attenuation). 

2)  Signal-to-Noise  Ratio  (SNR).  The  ratio  of  in-band  signal  power  after 
filtering  by  the  interpolation  kernel  (a  measure  of  the  relative 
strength  of  out-of-band  interpolation  noise  components). 

The  following  interpolation  filter  kernels  were  evaluated  in  this  m2unner: 

a)  4-point  cubic  convolution 

b)  4-point  cubic  B-spline 

c)  4-point  custom  FIR  filter 

d)  4-point  DFT  filter 

e)  6-point  DFT  filter 

f)  8-point  DFT  filter 

Cubic  convolution  is  a  4-point  polynomial  kernel  originally  derived  as  an 
efficient  approximation  to  the  sine  function  [Wolberg,90] .  The  c^ibic  B- 
spline  kernel  is  the  filter  defined  by  three  successive  .self-convolutions 
of  the  unit  recteingle  function  {Wolberg,90] .  The  custom  FIR  filter  is  a 
special  kernel  designed  by  Space  Computer  Corporation  using  the  Remez 
Exchange  optimization  algorithm  with  special  zero  placement.  The  DFT 
filters,  originally  derived  in  close-form  by  R.  Lucke  at  NRL 
[ Lucke, undated ] ,  are  obtained  by  confuting  the  inverse  DFT  of  the  N-point 
linear  phase  filter  in  the  spatial  frequency  domain  that  implements  a 
specified  image-domain  treinslation. 

Fig.  9-2  through  9-7  (in  Appendix  9.3)  plot  the  impulse  response  and 
frequency  response  for  each  of  the  above  kernels.  The  calculated  figures- 
of-merit  are  given  in  Table  4-1  below. 
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Kernel 

Signal  Loss  (dB) 

SNR  (dB) 

Cubic  Convolution 

0.202 

24.63 

Cubic  B-Spline 

1.110 

37.33 

Custom  FIR  Filter 

0.730 

33.83 

DFT-4  Filter 

0.004 

26.98 

DFT-6  Filter 

0.008 

31.86 

DFT-8  Filter 

0.000 

35.42 

Table  4-1;  Flqures-of-Merit  for  Interpolation  Kernels 

Basic  tradeoffs  ^^nong  the  kernels  are  evident.  For  exanple,  cubic 
convolution  has  a  good  passband  characteristic  but  relatively  poor 
sidelobe  response.  In  contrast,  the  cubic  spline  kernel  achieves 
excellent  sidelobes  at  the  expense  of  increased  in-band  attenuation.  The 
custom  4-point  FIR  filter  represents  a  somewhat  better  balance  between  the 
conflicting  desires  for  a  flat  passband  aund  low  sidelobes.  The  even- 
sample  DFT  kernels  are  also  interesting  in  that  they  combine  a  nearly 
perfect  passband  with  reasonably  good  sidelobe  levels. 

The  perfomance  differences  between  the  three  DFT  filters  illustrate  the 
general  trend  toward  improved  response  with  increasing  kernel  length  or 
number  of  coefficients.  Of  course,  a  resampling  algorithm  based  on  a 
longer  kernel  is  also  relatively  expensive  to  implement.  Unless  the 
signal-to-noise  ratio  is  unusually  high,  it  appears  that  a  well-chosen  4- 
point  kernel  should  suffice  for  most  change  detection  applications  with 
real  imagery.  The  registration  experiments  conducted  during  Phase  I 
utilized  resampling  algorithms  based  on  either  the  cubic  B-spline  or  the 
DFT-4  kernel. 

In  classical  interpolation  lore,  the  use  of  a  filter  with  a  smooth 
passbeund  response  is  considered  to  be  of  paramount  inportauice  to  minimize 
spectral  amplitude  distortion  of  the  "signal"  being  interpolated. 
However,  for  frame  difference  signal  processing  applications,  it  seems 
that  this  criterion  might  be  considerably  less  importeuit  them  the  out-of- 
band  sidelobe  level.  For  example,  how  importeuit  is  it  to  faithfully 
reproduce  the  spectral  content  of  the  unchemged  portion  of  the  backgrouuid 
it  is  is  going  to  be  subtracted  out  later?  Is  it  necessarily  bad  to 
attenuate  higher  in-band  frequencies  if  those  frequencies  are  dominated  by 
sensor  noise  and/or  aliasing  components  rather  them  by  basebemd  signal 
components . 

One  way  to  clarify  these  trades  for  electro-optical  sensor  images  wuui'l  l>e 
to  formulate  a  composite  signal  model  consisting  of  "change"  signal,  a 
remdom  background,  and  additive  noise;  properly  accounting  for  important 
sensor  effects  such  as  optical  blur,  detector  spatial  filtering  and  2-D 
focal  pleuie  sampling.  An  interpolation  filter  response  could  then  be 
applied  to  each  of  these  scene  components  and  used  to  derive  more 
meeuiingful  figures-of-merit,  such  as  the  effective  signal-to-clutter  + 
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noise  ratio  after  a  frame  differencing  operation.  Such  an  analysis  could 
be  pursued  under  a  Phase  II  program  to  help  guide  the  final  selection  of 
the  image  resanqpling  algorithm. 


4.4  Results 

4.4.1  Results  for  Sub-Pixel  Algorithm  #1 

Preliminary  1-D  simulation  of  a  square  wave  function  offset  to  .3  and  .9 
pixels  was  performed  for  VEXCEL's  Algorithm  #1.  No  noise  was  added  to  the 
square  wave  as  yet.  The  resulting  errors  for  estimation  were: 

o  4x10“^  pixel  for  .9  offset, 

o  5x10”^  pixel  for  .3  offset. 

4.4.2  Results  for  Sub-Pixel  Algorithm  #2 

Both  extensive  simulation  and  real  data  were  employed  for  testing  SCO's 
Algorithm  #2. 

4. 4. 2.1  Simulation  Experiments 

Simulation  experiments  were  performed  to  verify  the  accuracy  of  the  phase 
correlation  registration  algorithm  with  the  above  peak  interpolation 
scheme.  The  experiment  made  use  of  synthetically- jittered  frames  (without 
added  noise)  to  provide  a  controlled  test  of  measurement  quality  for 
several  variations  of  the  basic  algorithm. 

The  experiments  were  based  on  a  typical  long-wave  IR  scene  collected  by 
the  MIT  Lincoln  Laboratory  IRMS  sensor.  To  determine  the  accuracy  of  the 
sub-pixel  translation  measurements  provided  by  the  registration  algorithm, 
we  used  this  scene  to  create  a  synthetic  sequence  of  three  new  freunes 
having  known  displacements  in  the  elevation  (or  vertical)  direction.  This 
was  done  by  Fourier  '  transforming  the  original  256  x  256  IRMS  scene, 
bandlimiting  the  transform  with  a  2-d  Hanning  spectral  window,  and 
zerofilling  the  windowed  array  to  a  length  of  1024  points  in  the  vertical 
direction.  The  resulting  256  x  1024  array  was  then  transformed  back  to 
the  image  dcanain  to  obtain  a  filtered  scene  oversampled  by  a  factor  of  4:1 
in  elevation.  This  scene  was  then  vertically  decimated  by  a  factor  of  4, 
starting  from  three  different  rows  1,  2  and  3,  to  produce  a  set  of  three 
new  256  x  256  frames.  Adopting  the  first  of  these  new  fraunes  as  the 
registration  reference,  the  second  and  third  frames  have  respective 
vertical  displacements  of  precisely  -0.25  and  -0.50  pixel. 

As  a  check  the  accuracy  of  the  phase  correlation  algorithm,  we  mea; uipd 
the  know  2-D  shifts  between  the  synthetic  frame  pairs  2  and  1  and  3  and  i. 
The  measurements  obtained  from  several  variations  of  correlation 
processing  combined  with  pare±)olic  peak  interpolation  are  summarized  in 
Tables  4-2  and  4-3  below.  "Circular  Correlation"  refers  to  a  conventional 
(unwhitened)  cross-correlation;  it  produces  biased  measurements  in  either 
case.  In  contrast,  the  circular  phase  correlation  measurements,  after 
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apprqpriate  bias  correction,  are  nearly  perfect  for  both  the  Uniform 
(i.e.,  rectemgular)  spectral  window  and  a  heavily  tapered  Hanning^ 
spectral  window  function  defined  by 

w(m,n)  -  w(m)w(n)  -  (l+cos(2nin/N)  ]^  ( (l+cos(2iin/N) 

'Rie  calibration  results  given  here  establish  the  systematic  accuracy  of 
the  registration  algorithm  based  on  discrete  phase  correlation  ccxnbined 
with  an  appropriate  peak  measurement  procedure. 

4. 4. 2. 2  Application  to  Real  Data 

Although  {diase  correlation  can  be  a  remarkedt>ly  accurate  procedure  for 
measuring  the  sub-pixel  displacement  between  a  pair  of  well-correlated 
image  frames,  its  ability  to  register  moderately  correlated  Imagery  with 
long-term  changes  is  less  certain.  To  gain  insight  into  its  performance 
in  this  sitxiation,  we  processed  the  three-band  Death  Valley  infrared  data 
described  in  Section  2.4. 

The  basic  assimption  was  that  the  residual  registration  error  betvreen  the 
globally-registered  1983  and  1988  Death  Valley  images  could  be 
approximated  by  a  slowly-varying  2-D  translation.  Local  misregistration 
measurements  we.'®  obtained  by  applying  the  phase  correlation  procedure 
described  above  to  128x128  pixel  sub-blocks  overlapped  by  50%  in  either 
dimension.  For  the  512x512  scene,  this  resulted  in  a  total  of  49  2-D 
displacement  measurements  spaced  at  64-pixel  intervals  within  the  frame. 

Since  the  Death  Valley  scene  appeared  to  exhibit  more  long-term 
correspondence  in  its  principle  spectral  components  than  in  the  original 
spectral  barxis,  the  block  phase  correlation  procedure  was  separately 
applied  to  each  of  the  three  spectral  conqponents  of  the  1983  and  1988 
multi-beuKi  images.  For  each  block,  the  component  which  produced  the 
highest  correlation  peak  was  used  to  measure  the  2-D  translation  of  the 
1988  image  with  respect  to  the  1983  reference  image.  In  this  way,  the 
spectral  ccxnponent  with  the  best  local  feature  contrast  was  always  used  to 
estimate  the  local  misregistration. 

Figure  4-3  plots  the  local  treuislation  measurements  in  each  dimension  of 
the  Death  Valley  scene  vs.  the  correlation  block  number.  For  this 
purpose,  the  blocks  are  assigned  numbers  in  standard  raster  fashion 
starting  in  the  upper  right-hand  corner  of  the  scene.  Thus  correlation 
block  1  is  located  in  the  upper  right-hand  (or  northwest)  corner,  v4iile 
block  49  is  at  the  lower  left-hand  (or  southwest)  corner.  Vertical  dotted 
lines  are  drawn  at  7-block  intervals  to  indicate  measurements  taken  along 
the  same  horizontal  row  of  the  image.  These  measurements  indicate  the 
presence  of  more  or  less  reindora  multi-pixel  registration  errors  between 
the  Death  Valley  scenes  following  coarse  global  registration  hash’d  on 
manual  control  points. 

There  is  no  obvious  way  to  determine  the  accuracy  of  these  displacement 
measurements  in  the  absence  of  ground  truth  information.  However,  one 
reasonable  check  is  to  perform  a  second  fine  registration  step  based  on 
the  measurements,  and  determine  >4iether  the  pixel-to-pixel  correspondence 
between  the  two  image  sets  is  improved. 
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Tahip  4—2  Measurement  Results  for  Synthetically  Jittered  IRMS  Frames  1  and  2. 
Actual  Jitter  in  (Azimuth,EIevation)  =  (0,-0.25)  Pixel 


Cross-Correlation  Processing 

FFT  Size 

Cross-Spectrum 

Weighting 

2-D  Jitter 

Measurement 

2-D  Measurement 

Error 

Circular  Correlation 

256x256 

Uniform 

(-0.024,-0.236) 

(-0.024, 0.014) 

Circular  Phase  Correlation 

256x256 

Uniform 

( 0.000,-0.252)* 

( 0.000,-0.002) 

Circular  Phase  Correlation 

256x256 

2 

Hanning 

( 0.000,-0.250)* 

( 0.000, 0.000) 

Non-Circular  Phase  Correlation 

512x512 

Uniform 

(-0.001,-0.234)* 

(-0.001, 0.016) 

Non-Circular  Phase  Correlation 

512x512 

Hanning" 

( 0.000,-0.247)* 

( 0.000, 0.003) 

—  bias-corrected  measurements 


Table  4-3  Measurement  Results  for  Synthetically  Jittered  IRMS  Frames  I  and  3. 
Actual  Jitter  in  (Azimuth,Elevation)  =  (0,-0.50)  Pixel 


Cross-Correlation  Processing 

FFT  Size 

Cross-Spectrum 

Weighting 

2-D  Jitter 

Measurement 

2-D  Measurement 

Error 

Circular  Correlation 

256x256 

Uniform 

(-0.048.-0.499) 

(-0.048, 0.001) 

Circular  Phase  Correlation 

256x256 

Uniform 

( 0.000,-0.500)* 

( 0.000, 0.000) 

Circular  Phase  Correlation 

256x256 

7 

Hanning" 

(  0.000,-0.500)* 

( 0.000, 0.000) 

Non-Circular  Phase  Correlation 

512x512 

Uniform 

(-0.002,-0.445)* 

(-0.002. 0.055) 

Non-Circular  Pha.se  Correlation 

512x512 

Hanning" 

(-0.001.-0.491)* 

(-0.001.0.009) 

—  bias-corrected  measurements 
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To  do  this,  the  translation  measurements  made  in  each  dimension  were  fit 
to  cubic  thin-plate  splines  to  obtain  smooth  2-D  surfaces  of  displacement 
vs.  pixel  position  for  the  1988  image.  These  displacement  surfaces  were 
then  used  by  a  sliding-window  interpolator  to  resanyle  the  globally- 
registered  images  from  each  of  the  three  wavebands.  The  interpolator 
employed  for  this  purpose  was  the  cubic  spline  kernel  discussed  in  Section 
4.3. 

Two  basic  figures  of  merit  were  used  to  assess  whether  any  iin>rovement  was 
obtained  from  fine  registration.  The  first  was  the  global  average  pixel 
correlation  coefficient  between  corresponding  bands  of  registered  imagery, 
which  should  increase  if  the  unchanged  background  component  is  better 
aligned.  Table  4-4  shoes  that  small  increases  in  the  average  pixel 
correlation  were  indeed  observed  in  each  of  the  three  baunds  due  after  fine 
registration. 


TIMS  Pixel  Correlation  Coefficient 

Band  After  Global  Registration  After  Fine  Registration 


1 

0.57 

0.60 

3 

0.59 

0.62 

5 

0.69 

0.72 

Table  4-4. 

Pixel  Correlation 

in  Death  Valley  1983  and  1988  Images 

A  second  indicator  of  registration  performance  is  the  rms  level  of  a 
minimum-power  weighted  difference  frame  formed  from  each  registered  bamd 
pair,  which  is  affected  by  clutter  leakage  due  to  background  misalignment 
as  well  as  actual  long-term  changes  in  the  scene.  Tadsle  4-5  shows  that 
small  improvements  in  this  loeasure  are  also  observed  after  the  fine 
registration  step. 


TIMS 

Band 

RMS  Level  of  Difference 
After  Global  Registration 

Image  (/oW/sr/cm^//um) 

After  Fine  Registration 

1 

39.4 

38.2 

2 

46.1 

44.5 

3 

31.1 

36.3 

Table  4-5:  RMS  Level  of  Weighted  Difference  Image  for  Death  Valley  1903 
and  1988  Scenes 

Figure  4-4  compares  the  difference  images  from  TIMS  Band  1  on  the  same 
grey-scale  before  and  after  the  fine  registration  based  on  blockwise  phase 
correlation.  These  difference  images  are  qualitatively  similar  in  most 
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respects.  However,  in  several  places  denoted  by  black  arrows,  it  a^^ars 
that  the  fine  registration  may  have  reduced  the  amount  of  contrast  edge 
leakage.  Most  of  the  pixel-level  "changes"  observed  in  these  difference 
frames  result  from  pixel  radiance  variations  caused  by  local  teiif)erature 
ch^^lges  between  looks. 

It  appears  that  the  phase  correlation  technique  has  the  potential  to 
inqprgve  the  local  registration  of  multi-band  imagery  to  be  utilized  for 
change  detection.  However,  it  also  seems  clear  that  the  effectiveness  of 
this  algorithm  (and  all  other  area-based  correlation  methods)  will  be 
reduced  by  image- to- image  decorrelation  vdiich  results  frcxn  long-term  pixel 
intensity  fluctuations. 

4.5  Remaining  Problems 

Additional  testing,  including  2-D  offsets,  must  be  performed  for  Algorithm 

#1. 
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^  Potential  Change  Cueing 

The  fijndamental  problem  of  pixel-level  ch2mge  detection  is  to  utilize 
multiple  image  observations  of  the  same  scene  taken  at  different  times  and 
possible  different  sensors  to  identify  those  pixels  in  vdiich  significemt 
changes  have  occured.  Changes  of  military  interest  include  the  presence 
of  new  man-made  features  such  as  roads,  bridges  and  buildings  in  addition 
to  long-term  chauiges  in  the  boundaries  or  extent  of  natural  features  like 
forests  and  rivers. 

In  an  automated  change  detection  system,  the  primary  role  of  the  pixel- 
level  processing  is  to  increase  the  signal-to-clutter  ratio  of  desired 
change  features  to  the  point  where  higher  level  "object"  processing  can  be 
effective.  This  requires  a  high  degree  of  adaptive  cancellation  of 
undesired  "clutter"  due  to  unchang^  or  minimally-changed  portions  of  the 
background  scene.  If  robust  background  cancellation  is  achieved,  then  the 
potential  scene  changes  can  be  reliably  "cued"  on  the  basis  of  intensity 
for  use  in  higher-level  aunalysis. 

A  key  objective  of  the  Phase  I  effort  was  to  assess  how  well  this 
objective  could  be  met  through  the  application  of  multi-image  adaptive 
filtering  methods  based  on  statistical  decision  theory.  In  the  following 
subsections,  we  present  the  basic  theory  of  adaptive  change  detection, 
discuss  the  application  of  the  concept  to  multi-sensor  image  ^ta,  present 
the  results  of  chauige  detection  experiments  conducted  mder  Phase  I,  and 
describe  some  remaining  challenges  for  the  Phase  ll  effort. 

5.1  Theory  of  Adaptive  Change  Detection 

A  basic  theory  of  detection  is  formulated  below  for  the  case  of  an 
arbitrary  change  signal  observed  in  a  multi-image  background.  A 
performance  measure  for  pixel-level  change  enhancement  processing  is  also 
established. 

5.1.1  Modeling 


Consider  a  set  of  observations  (X),}".!  from  N  distinct  images.  For 
example,  the  might  represent  the  samples  observed  in  corresponding  co¬ 
registered  pixels  from  N  different  images  to  be  processed  for  pixel-level 
chamges  (although  it  should  be  noted  that  other  sets  of  "observables"  caui 
also  be  used,  as  noted  in  Section  5.2  below).  If  the  significant  changes 
in  these  observations  are  described  as  additive  signals,  then  the 
observation  in  each  image  can  be  modeled  by 

• 'N  (5-1) 

where  s^  auid  n^  represent  the  change  "signal"  and  the  comparatively 
unchamged  background  components  in  the  k^**  image.  Using  N-vector 
notation,  we  can  write  (1)  more  compactly  as 

X  -  s  +  n  (5-2) 
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For  modeling  purposes,  we  treat  the  background  n  statistically  by 
assigning  it  an  arbitrary  vector  mean  fj  and  an  NxN  covariance  matrix 

K  -  E(n-^)(n-^)T,  (5-3) 

where  "T"  denotes  the  matrix  transpose  and  "E"  indicates  an  ensemble 
average.  Ihis  is  a  commonly  used  model  for  natural  backgrounds  in  multi- 
image  remote  sensing  and  signal  processing  applications.  Its  utility  in 
describing  the  pixel  variations  over  local  regions  in  natural  scenes  has 
been  confirmed. 

The  cheuige  signal  s  is  modeled  as  a  deterministic  vector.  For  exeunple, 
the  ^vector  corresponding  to  a  chauige  feature  with  additive  intensity  s^ 
present  in  the  first  of  the  N  image  observations  would  be  denoted  by 

s  -  (Sj  0  0  •  •  •  OJT  (5-4) 

Ihe  use  of  a  general  formulation  for  the  chauige  signal  provides 
flexibility  in  modeling  situations  v^ere  changes  may  be  present  in 
differing  amounts  in  the  multiple  images.  A  typical  example  is  two  sets 
of  multi-band  imagery  collected  at  two  distinct  observation  times. 

Given  the  above  model,  it  is  well-known  that  the  optimum  linear  filter  for 
the  signal  s  is  given  by 

y(x)  -  s^'K-Mx-//)  (5-5) 

This  multi-dimensional  matched  filter  forms  a  linear  combination  of 
observations  which  is  optimum  in  the  sense  that  it  maximizes  the  ratio  of 
change  signal  magnitude  to  rms  background  clutter.  Note  that  for  the 
special  case  where  the  background  statistics  are  jointly  Gaussian,  this 
filter  is  the  maximum-likelihood  detector  for  the  signal  s.  For  the 
special  case  of  change  detection  in  a  single  image  with  N-I  available 
reference  images,  the  cibove  filter  also  corresponds  to  a  maximum- 
likelihood  detector  derived  by  (Margalit  et  al,85]. 

In  practice,  of  course,  the  joint  background  statistics  (/u  and  K)  needed 
to  implement  the  change  detection  filter  are  not  known  in  advauice,  so  they 
must  be  estimated  from  observed  data  by  appropriate  spatial  averaging. 
This  averaging,  which  forces  the  filter  to  adapt  to  the  actual  background 
conditions  encountered,  is  eui  inportauit  aspect  of  change  processing  and  is 
discussed  in  more  detail  in  Section  5.2. 

5.1.2  Change  Enhancement  Performance 

Ihe  signal  enhancement  performance  of  the  matched  change  detector  (5-"^)  is 
best  described  in  terms  of  the  signal-to-clutter  ratio,  formally  de'ined 
by 


SCR 


out 


'[Ey(x)  ]^ ;  change  signal  present  (H,  )]^/^ 
.Var  [y(x)];  si^al  not  present  (H^)  . 


Using  (5-3)  and  (5-5),  we  have 


<  5-6> 
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Ey(x)  under  -  s'^'K'^s 

Ey(x)  under  -  0 

Var{y(x)}  xander  -  Ey^  (x)-s'^K'^E{x-£)(x-^)''K"^^s‘’^IC^8 
The  output  SC31  (5-6)  is  then 

SCRout  -  (5-7) 

For  comparative  purposes,  it  is  useful  to  define  an  input  SCR  in  the 
image  prior  to  multi-image  change  processing  as 

SCR,  -  |sj/a,  (5-8) 

Where  |s^  I  and  represent  the  additive  signal  magnitude  and  background 
clutter  (power)  in  image  k,  respectively.  The  multi-image  change 

detection  processing  gain  with  respect  to  bank  k  is  then  defined  by  the 
ratio 


G  -  SCR„„t/SCR, .  (5-9) 

To  obtain  insight  into  the  factors  affecting  performance,  we  performed  a 
sanple  SCR  calculation  based  on  equicorrelated  background  observations  in 
N  images.  The  background  covarience  matrix  for  this  case  is 


K 


1  P 
P  1 


LP  P 


p' 

P 


1 


(5-10) 


where  is  the  background  power  in  each  image  emd  p  is  the  common 
correlation  coefficient  between  images.  The  inverse  of  (5-10)  is  given  by 


K-i-  {<H(l-p)}-i 


'oc  6 
0  a 


0 


e  0 


aj 


(5-11) 


xdiere 

a  -  1  -(p/(l+(N-l)p)]  (5-12a) 

0  -  -p/(l+(N-l)p)  (5-12b) 

The  output  SCR  may  be  calculated  from  (5-7)  and  ( 5-11 )-( 5-12 )  for  any 
specified  signal  vector  s.  For  the  change  detection  application,  the 
signal  is  often  assumed  to  be  present  in  just  one  of  the  images  (say,  the 
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Then  we  have 


first  one)  and  the  s-vector  is  given  by  (5-4). 

SCRout-  {Si/(a(l-p)V2]]  {i-[p/(i+(N-l)p]}V2  (5_i3) 

The  availedsle  SCR  in  the  change  image  alone  is 
SCRi  -  |sj/ff  (5-14) 

Ihus  the  chemge  detection  processing  gain  for  this  case  is 

G  -  SCR„„t/SCRi-  {{(l+(N-2'p}/{(l-p)[l+(N-l)p]}Ji/2 

Fig.  5-1  plots  the  SCR  gain  G  vs.  background  correlation  p  for  various 
numbers  of  images  N.  The  gain  is  fairly  small  if  p  is  less  than  0.9,  but 
very  large  gains  are  possible  vdien  the  background  observations  are  highly 
correlated  (p->l).  It  is  iirportant  to  note  that  most  of  the  available 
processing  gain  czui  be  obtained  from  only  2  image  observations,  one  which 
contains  the  change  signal  and  another  vdiich  serves  as  the  correlated 
reference  for  background  suppression.  Ihe  relative  change  enhancement  for 
the  dual-image  case  is  l/(l-p^)^/^. 

5.2  Application  to  Multi-Sensor  Imagery 

The  basic  principle  of  pixel-level  change  detection  is  relatively 
straightforward  as  outlined  above.  The  challenge  lies  in  the  application 
of  the  techniques  to  real-world  imagery.  Key  implementation  issues  which 
were  considered  in  the  Phase  I  study  include; 

1)  The  selection  of  imagery  to  support  change  detection  processing; 

2)  The  selection  of  the  set  of  "observations"  to  which  change  processing 
is  applied; 

3)  The  method  of  averaging  used  to  adapt  the  change  processing  to 
variedDle  background  conditions; 

4)  The  use  of  spatial  information  in  the  change  detection  process. 

These  issues  are  discussed  in  turn  below. 

5.2.1  Selection  of  Imagery 

The  analysis  is  Section  5.1  shows  that  the  most  important  parameter 
affecting  pixel-level  cheuige  detection  perfornance  is  the  image-to-image 
correlation  of  the  background  in  which  the  potential  changes  are  to  be 
detected.  A  higher  correlation  results  in  more  effective  background 
suppression,  vdiich  makes  small  changes  easier  to  discriminate  and  analyze. 

Experience  with  actual  sensor  imagery  indicates  that  high  values  of  the 
pixel  correlation  coefficient  can  in  fact  be  obtained  from  multiple  time 
observations,  provided  that: 

a)  The  images  are  accurately  registered  to  one  another; 

b)  A  close  relationship  exists  between  the  physical  mechcinisms  that 
generated  the  image  observations; 

c)  Ihe  observations  are  scheduled  to  minimize  decorrelating  effects  such 
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as  seasonal  reflectivity  variations,  shadowing,  solar  heating, 
thermal  lag  effects,  etc. 

As  a  general  rule,  the  observed  pixel  correlation  is  strongly  related  to 
the  phenomenological  similarity  between  various  types  of  images.  Thus, 
for  a  given  scene,  optical  images  tend  to  correlate  well  with  other 
optical  images,  somev^at  less  well  with  reflected-IR  images,  even  less 
with  thermal  IR  images,  and  quite  poorly  with  radar  images.  This  trend  is 
indicated  in  Table  5-1  below,  which  gives  the  measured  correlation 
coefficient  for  corresponding  512x512  pixel  blocks  of  pre-processed  Raisin 
City  imagery.  The  reference  band  TMl  is  located  in  the  blue-green  optical 
region  from  0.45-0.52  microns. 


Waveband  Pair 


Comments 


Correlation 

Coefficient 


TMl  X  TM3  optical  X  optical  (0 .63-0.69/^1)  0.98 
TMl  X  TM5  optical  X  near-IR  (1.55-1.75/um)  0.89 
TMl  X  TM6  optical  X  thermal-IR  (10. 40-12. 50/mi)  0.36 
TMl  X  SAR  optical  X  L-beuid  aircraft  radar  0.12 


Table  5-1;  Correlation  Coefficients  for  Raisin  City  Images 

Cki  this  basis,  one  would  predict  that  good  pixel-level  change  enhancement 
could  be  obtained  by  pairing  a  TMl  image  with  a  TM3  image,  but  that  little 
enh2uicement  would  result  from  processing  corresponding  pixels  in  the  TMl 
and  SAR  images.  Change  detection  experiments  discussed  in  Section  5.3 
generally  confirm  this  expection. 

Even  ^en  multiple  images  from  the  same  spectral  band  are  available  for 
chauige  detection,  diurnal  euid  seasonal  variations  in  the  imaged  intensity 
values  can  decorrelate  the  observed  data.  Although  certain  seasonal 
variations  may  constitute  changes  of  military  interest,  the  presence  of 
too  much  image-to-image  variation  can  make  it  very  difficult  to  achieve 
effective  image-to-image  registration  and  background  suppression.  At 
thermal  infrared  wavelengths,  for  exan^jle,  images  taken  at  night  often 
differ  considerably  from  daytime  images  due  to  the  lack  of  a  solar 
reflected  conponent  and  the  presence  of  temperature  differentials 
resulting  from  the  variable  thermal  lags  of  different  materials.  Seasonal 
variations  are  induced  by  changes  in  the  moisture  content  of  the  terrain 
or  vegetation;  i.e.  damp  soil  normally  has  a  lower  apparent  tempernture 
than  dry  soil  due  to  the  effect  of  evaporative  cooling. 

Certain  components  of  long-wave  IR  images  of  natural  scenes  can  be  fairly 
well  correlated,  even  when  taken  years  apart.  For  example,  Table  5-2  show 
the  measured  correlation  coefficients  for  various  image  pairs  from  the 
Death  Valley  scenes  collected  in  1983  and  1988.  Bands  1,  3  and  5  are  the 
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original  radieuice  images  after  global  registration.  The  "PC"  images  are 
the  three  principle  spectral  components  generated  from  Bands  1,  3  and  5 
and  shown  originally  in  Section  2.4 


Image  Pair 

Correlation 

TIMS  Band  1 

0.31 

TIMS  Band  2 

0.32 

TIMS  Band  3 

0.40 

PC  Image  1 

0.27 

PC  Image  2 

0.89 

PC  Image  3 

0.59 

TeJjle  2;  Pixel  Correlation  of  TIMS  Death  Valley  Data  Taken  in  1983  and 
1988 

Note  that  relatively  high  pixel  correlations  are  observed  in  the  second 
emd  third  principal  components  of  the  scene. 

For  change  detection  within  a  single  waveband,  the  use  of  multi-band  long¬ 
wave  thermal  imagery  in  the  8-12/um  region  appears  to  have  considerzdile 
potential.  Although  in  general  terrain  features  both  emit  and  reflect  IR 
radiation,  the  dominant  phenomenon  in  the  long-wave  region  is  greybody 
emission  as  a  function  of  object  ten^rature  and  emissivity.  Since 
undisturbed  natural  backgrounds  exhibit  relatively  little  long-term  change 
in  emissivity,  the  major  parameter  that  determines  the  apparent  radiance 
is  the  temperature.  High  correlations  between  thermal  images  taken  at 
different  times  can  be  expected  if  the  observations  are  carefully 
scheduled  to  minimize  differential  temperture  variations  due  to  diurnal 
and  seasonal  effects. 

5.2.2  Selection  of  Observations 


The  most  natural  observations  to  combine  in  a  cheinge  detection  operation 
are  the  corresponding  pixel  values  in  the  registered  images.  In  this 
case,  the  pixel  correlation  coefficient  determines  the  relative  amount  of 
change  enhancement  that  ceui  be  achieved. 

In  principle,  however,  one  can  utilize  any  set  of  observedDles  that  can  be 
derived  from  the  image  pixel  data.  An  interesting  alternative  which  was 
examined  in  the  Phase  I  study  is  the  use  of  image  spatial  frequency 
components,  rather  than  image  pixels,  as  the  input  "data"  to  the  chanue 
detection  process.  This  particular  choice  was  motivated  by  the  empirical 
observation  that  much  of  the  correspondence  between  displayed  images  from 
different  spectral  bands  resides  in  the  edges  and  other  high  frequency 
artifacts.  If  the  frequency  components  corresponding  to  these  features 
can  be  isolated  by  means  of  Fourier  analysis,  their  higher  correlations 
could  be  exploited  to  achieve  more  effective  clutter  cancellation. 
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Fig.  5-2  shows  a  portion  of  the  registered  TMl  and  Aircraft  SAR  images 
from  the  Raisin  City  site.  As  noted  above,  the  average  pixel  correlation 
coefficient  for  these  images  is  only  0.12,  implying  that  they  are 
practically  uncorrelated  at  the  pixel  level.  This  is  not  surprising  in 
view  of  the  different  sensors  used  aind  the  decorrelating  effect  of  the  SAR 
image  speckle.  However,  the  correspondence  between  some  of  the  boundaries 
in  these  scenes  is  evident. 

To  examine  the  sensitivity  of  the  correlation  coefficient  to  spatial 
frequency  for  this  image  pair,  the  following  procedure  was  used.  First, 
the  two  full  TMl  eind  SAR  images  were  sub-divided  into  16  128x128  pixel 
sub-blocks,  a  2-D  DFT  of  each  block  was  computed,  and  the  complex 
components  of  the  corresponding  blocks  from  each  image  were  conjugate 
multiplied  together  (i.e.,  correlated).  The  like-indexed  correlations  in 
all  blocks  were  then  averaged  together  and  normalized  to  obtain  an 
ensemble  average  correlation  coefficient  estimate  as  a  function  of  2-D 
spatial  frequency.  The  magnitude  of  the  resulting  correlation  array  is 
plotted  in  grey-scale  format  vs.  normalized  2-D  spatial  frequency  in  Fig. 
5-3(a),  where  black  indicates  a  moderate  positive  correlation  (0.6)  and 
white  denotes  zero  correlation.  The  histogram  of  spatial  frequency 
correlations,  shown  in  Fig.  5-3(b),  has  a  mean  value  of  0.23.  Although 
many  of  the  spatial  frequencies  are  almost  as  poorly  correlated  as  the 
pixels,  there  are  some  notable  exceptions.  For  example,  roost  of  the 
components  in  the  vicinity  of  zero  horizontal  ("X")  frequency  (oriented 
vertically  in  the  plot)  have  a  moderately  high  correlation  (0.6  or 
greater).  These  components  are  dominated  by  edges  with  a  near  horizontal 
orientation  in  the  scene. 

Change  detection  can  oe  inplemented  with  spatial  frequency  domain  in 
essentially  the  same  was  as  for  image  pixels,  with  the  necessary 
modifications  for  complex-valued  data.  Given  a  complex  vector  x 
representing  a  set  of  corresponding  spatial  frequency  components  from  N 
co-registered  image  blocks,  we  confute  the  linear  filter. 

y(x)  »  s^K'Mx-/^)  (5-16) 

where  s  is  the  change  signal,  'H'  denotes  the  conjugate  transpose,  and  fj 
and  K  now  represent  the  background  mean  eind  covariance  at  a  single  spatial 
frequency  of  interest.  This  same  process  can  be  performed  for  all 
available  frequency  conponents  in  each  block,  after  vdiich  the  results  can 
be  filtered  euid  transformed  back  to  the  image  domain  for  interpretation. 

5.2.3  Adaptation  to  the  Background 

Although  the  matched  change  detection  filter  described  in  Section  5.1  is 
an  optimum  linear  discriminant,  its  immediate  application  is  limit' d  i'v 
the  apparent  need  for  prior  descriptions  of  the  scene  backgioujK.1 
statistics  (i.e.,  mean  and  covariance).  A  standard  way  to  remove  this 
dependence  is  to  estimate  the  ensemble  statistics  by  averaging  over  a 
sufficiently  large  number  of  observations.  If  the  changes  to  be 
discriminated  are  small  in  magnitude  or  occupy  only  a  small  fraction  of 
the  observations  chosen,  then  their  contribution  to  these  averages  will  be 
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small  and  can  be  ignored  for  all  practical  purposes. 

For  pixel  Change  processing,  the  sinplest  approach  to  adaptivity  is  to 
perform  background  averaging  and  subsequent  filtering  over  the  entire 
scene.  A  potential  problem  with  this  method  is  that  the  background 
statistics  can  vary  considerably  with  changes  in  local  surface  content, 
leading  to  local  change  filter  mismatch.  The  use  of  small  spatial  windows 
alleviates  the  nonstationarity  problem  at  the  expense  of  a  reduced 
sensitivity  to  extended  cheuige  features.  For  dual-band  change  detection, 
it  cein  be  shown  that  a  covariance  estimate  based  on  at  least  49  vector 
samples  (i.e.,  a  7x7  window)  provided  detection  sensitivity  to  within  IdB 
of  the  perfectly  matched  filter.  However,  the  smaller  the  window,  the 
more  likely  it  is  that  larger  sized  changes  will  dominate  the  averages 
taken  in  particular  windows,  resulting  in  self-suppression  of  these 
features  during  the  matched  filtering  operation.  If  change  features  with 
a  very  wide  reinge  of  sizes  and  shapes  must  be  discriminated,  then  the  use 
of  a  pyramid  of  processing  window  sizes  may  be  appropriate.  A  more 
computationally-intensive  option  is  to  perform  brightness  and  texture- 
based  scene  segmentation  prior  to  change  processing,  to  identify  those 
regions  v4iich  appear  to  have  nearly  homogeneous  background  statistics. 

For  frequency  component  change  processing,  the  assumption  is  that  the  like 
components  taken  from  the  different  blocks  of  an  image  are  drawn  from  a 
stationary  ensemble,  but  that  the  statistics  of  different  components  can 
vary  arbitrarily.  The  averaging  is  therefore  performed  component  by 
component  across  the  DFT  blocks  in  the  image.  For  a  fixed  size  image,  a 
fundamental  tradeoff  exists  between  spatial  frequency  resolution  (which 
improves  with  a  larger  block  size)  and  the  number  of  independent  blocks 
available  in  the  image  (\^ich  decreases  with  the  use  of  larger  blocks). 

5.2.4  Use  of  Spatial  Information 

A  purely  pixel-level  change  detection  process  utilizes  the  information 
available  in  corresponding  pixels  of  two  or  more  images,  leaving  it  up  to 
the  interpreter  to  make  further  inferences  based  on  feature  size,  shape 
and  orientation.  However,  it  would  be  possible  to  optimally  combine 
information  from  multiple  pixels  for  automated  change  detection  based  on  a 
feature  "shape"  hypothesis.  A  general  formulation  of  this  approach, 
utilizes  a  pixel-level  change  filter  followed  by  a  matched  spatial  filter. 
Although  the  proper  use  of  shape  information  could  lead  to  higher  signal- 
to-clutter  ratios  for  chcuige  cueing,  several  potential  problems  arise. 
The  first  results  from  the  wide  range  of  potential  feature  sizes  and 
shapes,  which  would  require  the  use  of  a  large  beink  of  change  filters 
matched  to  various  objects.  Secondly,  the  preferential  spatial  filtering 
performed  by  this  approach  would  alter  the  appearance  of  objects  in  the 
scene,  making  higher  level  contextual  interpretation  more  difficult. 

Since  frequency  component  change  processing  is  carried  out  in  t.he  FocrieL 
transform  domain,  it  is  easy  to  modify  the  spatial  frequency  content  of 
the  change  image,  if  desired,  through  FIR  filtering.  The  frequency-domain 
equivalent  of  the  spatial  enhancement  approach  described  above  would 
involve  the  application  of  a  frequency-dependent  weighting  which  is 
proportional  to  the  spectrum  of  the  feature  of  interest.  However,  with  no 
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prior  spatial  information  about  the  features,  it  appears  that  a  better 
approach  is  to  simply  normalize  all  frequency  con^nents  so  that  they 
contribute  the  same  relative  power  to  the  final  chauige  image.  Since  the 
power  in  each  spectral  chauige  component  is  equivalent  to  its  variance, 
such  normalization  cam  be  achieved  by  modifying  (5-16)  to  inplement 

y(x)  -  s"K-^(x-/;)  (5-16) 

at  each  frequency  in  the  band.  This  spectral  "whitening"  appears  to  be 
quite  effective  for  change  detection,  as  seen  in  the  examples  below. 

5.3  C3iange  Detection  Results 

The  change  detection  concepts  discussed  above  were  qualitatively  tested  by 
processing  two  different  sets  of  imagery.  Preprocessed  Raisin  City  TM  and 
SAR  images  with  synthetically  injected  chamges  were  used  to  examine  the 
performance  of  variations  of  change  enhancement  processing.  Although  the 
Raisin  City  multi-band  data  set  does  not  contain  long-term  changes,  it  is 
still  useful  for  illustrating  the  effectiveness  of  change  processing 
applied  to  imagery  taken  from  different  bands.  Sensitivity  to  actual 
long-term  chauiges  was  examined  by  processing  TIMS  Death  Valley  long-wave 
infrared  data  collected  in  1983  and  1988. 

5.3.1  Raisin  City  Data 

The  Raisin  City  multi-band  data  set  consisted  of  7  Landsat  TM  bands  euid  3 
SAR  images  co-registered  to  one  another.  The  four  images  shown  in  Tedsle 
5-3  were  selected  for  change  processing  experiments. 


Band  Designation 

Comments 

TMl 

optical  band  (0.45-0.52/mi) 

TM3 

optical  band  (0.63-0.69Afln) 

TM5 

near-infrared  band  (1.55-i.75/um) 

SAR 

JPL  Aircraft  SAR  (L-band) 

Table  5-3:  Raisin  City  Images  Used  In  Change  Detection  Experiments 

The  TMl  optical  band  was  selected  as  the  "change  image".  For  reference 
purposes,  its  global  histogram  was  normalized  to  zero  mean  and  unit 
variance.  Change  features  of  varying  size  and  brightness  were  then 
synthetically  introduced  into  the  normalized  TMl  image  using  the  foui  step 
procedure  illustrated  in  Fig.  5-4.  The  first  step  was  co  inject  the 
following  change  signatures  shown  in  Fig.  5-  4(a)  by  direct  pixel 
intensity  replacement: 

a)  Small  Structures.  A  set  of  three  2x2  pixel  features  with  negative 
local  contrast  and  average  replacement  amplitude  -1; 
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b)  Road.  A  linear  feature  of  width  2  pixels  and  average  pixel 
replacement  amplitude  1,  corresponding  to  the  mean  intensity  of  a 
diagonally-oriented  road  in  the  scene; 

c)  Large  Structure.  A  rectangular  8x4  pixel  feature  with  positive  local 
contrast  and  average  replacement  amplitude  3; 

d)  Field.  A  rectangular  feature  of  size  30x35  pixels  with  average  pixel 
intensity  2,  comparable  to  that  of  several  other  brightly-reflecting 
fields  in  the  scene. 

The  actual  pixel  amplitudes  of  the  features  were  randomly  dithered  about 
their  average  values  to  simulate  the  local  rms  intensity  variation 
observed  in  the  ini  scene. 

The  remaining  three  steps  shown  in  Fig.  5-4  were  used  to  model  the  effect 
of  finite-resolution  TM  optics  on  the  injected  feature  responses.  The 
additive  change  image  shown  in  Fig.  5-4(b)  was  first  extracted  by 
subtracting  the  frame  shown  in  Fig.  5-4(a)  from  the  TMl  frame  prior  to 
injection.  Next,  the  additive  change  signals  were  convolved  with  a  2-D 
Gaussian  function  having  a  1-sigma  blur  circle  of  2  pixels  to  produce  the 
frame  irt  Fig.  5-4{c).  Finally,  the  blurred  signals  were  added  back  into 
the  original  TMl  image  to  generate  a  final  scene  containing  injected 
features  with  more  realistic  spatial  frequency  content,  as  shown  in  Fig. 
5-4 (d). 

Fig.  5-5  shows  the  full  512x512  IMl  frame  with  injected  changes,  referred 
to  below  as  TMIC.  A  number  of  change  detection  experiments  were  conducted 
using  the  TMlC  image  paired  with  other  Raisin  City  images  in  Table  5-3  to 
determine  how  well  the  change  features  could  be  enhanced  for  reliable 
cueing  on  the  basis  of  brightness;  typical  results  are  presented  below. 
It  should  be  eof^asized  that  all  of  the  change  images  shown  are  normalized 
and  presented  on  a  common  grey-scale,  so  that  direct  visual  comparisons 
among  them  can  be  made. 

The  TM1C-TM3  pair  is  used  to  illustrate  several  variations  of  the  change 
processing.  The  first  experiment  consisted  of  applying  the  change 
detection  filter  to  each  image  pixel,  using  global  averages  of  the 
background  mean  and  covariance  computed  over  the  entire  512x512  pixel 
scene.  The  resulting  output  frame  is  shown  in  Fig.  5-6.  The  synthetic 
change  features  are  obviously  enhanced  relative  to  Fig.  5-5  and  their 
shapes  are  well  preserved.  Several  other  portions  of  the  scene  are  also 
enhanced  by  the  process.  However,  it  is  difficult  to  determine  whether 
other  bright  regions  correspond  to  significamt  cheunges,  or  sinply  result 
from  poor  backgrovmd  suppression'  due  to  local  mismatch  in  a  change  filter 
based  on  global  scene  averages.  The  "patchiness"  of  the  change  image  in 
Fig.  5-5  in  a  direct  result  of  the  nonstationary  background  in  the  Raisin 
City  scene. 

An  obvious  way  to  improve  the  ability  of  the  change  filter  to  adapt  to 
local  background  conditions  is  to  utilize  smaller  averaging  windows.  Fig. 
5-7  shows  a  typical  result  for  pixel-level  change  processing  based  on  an 
averaging  block  of  size  16x16  pixels.  In  each  16x16  block,  the  local 
background  mean  and  covariance  were  used  to  compute  the  change  filter  to 
be  applied  to  the  256  pixels  of  that  block,  after  which  the  output  was 
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variauice-nonnalized  for  presentation  on  a  common  grey-scale.  Hie 
uniformity  of  the  suppressed  background  in  Fig.  5-7  is  em  inprovement  over 
Pig.  5-6,  but  the  smaller  averaging  window  clearly  results  in  significant 
self-suppression  of  the  change  features;  only  the  road  is  easily 
discernible. 

The  results  of  a  frequency-con^nent  change  detection  approach  are  shown 
in  Fig.  5-8.  To  generate  this  image,  the  bands  TMlC  and  TM3  were  each 
subdivided  into  16  blocks  of  size  128x128  pixels  which  were  transformed  to 
the  spatial  frequency  domain  by  2-D  DFTs.  The  adaptive  change  filter  (5- 
17)  was  then  applied  to  each  frequency  component  of  every  block,  based  on 
background  statistics  averaged  over  the  like  components  in  all  blocks. 
Finally,  the  processed  coiqxsnents  of  each  block  were  transformed  back  to 
the  image  domain  via  2-D  inverse  DFTs  and  mosaiced  to  form  the  output 
chemge  image  shown  in  Fig.  5-8.  Note  that  all  of  the  synthetic  changes 
stemd  out  very  well  against  the  suppressed  background.  Evidently,  the 
frequency  domain  technique  Cein  provide  superior  local  background 
suppression  without  excessive  attenuation  of  larger  chemge  features.  Hils 
is  a  very  valued)le  attribute  in  situations  where  a  wide  range  of  feature 
sizes  and  shapes  are  of  interest. 

Other  experiments  were  performed  to  confirm  the  original  expectation  that 
TM3  is  the  best  reference  band  for  change  detection, in  TMl.  For  exeunple. 
Fig.  5-9  and  5-10  show  the  results  of  frequency  component  chwge 
processing  applied  to  the  image  pairs  'ITilC-TM5  and  TNIC-SAR,  respectively. 
The  results  obtained  with  the  infrared  band  (m5)  in  Fig.  5-9  are  still 
rather  good,  but  the  overall  background  suppression  performauice  is  not  as 
high  as  in  Fig.  5-8  due  to  the  lower  correlation  between  the  optical  and 
infrared  bands.  The  results  for  the  SAR  image  in  Fig.  5-10  are  poorer 
still,  with  increased  edge  leakage  and  almost  no  enhancement  of  the  large 
"field”  feature. 
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6.  Preliminary  Analysis  of  Effects  of  Variations  in  Imaging  Scenarios 

One  concern  when  creating  algorithms  for  registration  and  change  detection 
is  the  robustness  of  such  procedures  over  the  varieties  of  imaging 
scenarios.  This  section  contains  some  observations  on  imaging 
variablities  for  SAR  in  6.1  and  optical  sensors  in  6.2. 

6.1  Variable  SAR  Scenario  Effects 

Radar  returns  from  vegetation  ceuiopies  consist  of  scattering  from  the 
vegetation  volume,  the  soil,  and  the  soil-vegetation  interaction.  These 
terms  depend  on  the  foliar  2uid  woody  biomass  and  the  soil  state.  Near 
vertical,  the  soil  term  becomes  dominant.  For  higher  incidence  angles, 
the  volume  term  dominates  for  lossy  ceinopies. 

Horizontal  (H)  polarization  couples  weakly  with  vertical  stalks,  whereas 
the  opposite  is  true  for  vertical  (V)  polarization.  Therefore,  as  a 
general  rule,  the  H  polarization  reveals  more  about  soils  2uid  the  V 
polarization  reveals  more  about  canopies.  As  always  for  a  given  set  of 
radar  parameters,  the  primary  determinauit  of  attenuation  in  vegetation  is 
water/unit  volume. 

Co-polarized  returns  are  usually  stronger  than  cross-polarized  returns. 
The  latter  are  caused  by  four  basic  mechamisms  [Fung,  Uladby,83]: 

o  Polarization  dependence  of  Fresnel  coefficients  for  quasi-specular 
reflection; 

o  Multiple  surface  scattering  induced  by  surface  roughness; 

o  Multiple  volume  scattering  due  to  inhomogeneities  within  the  skin 
depth; 

o  Physical  or  geometric  target  anisotropy. 

One  measure  of  the  degree  of  inhomogeneity  is  the  ratio  (a,/X)*,  vhere 
is  the  standard  deviation  of  the  dielectric  constemt  in  the  boundary  layer 
of  surface  scattering  or  of  the  spatial  discontinuities  in  volume 
scattering.  Therefore,  depolarization  increases  with  increasing 
frequency. 

Also  cross-polarization  (HV  and  VH)  returns  have  a  weaker  angular 
dependence  than  like  polarization  (HH  and  W)  returns  ,  especially  near 
the  vertical.  Therefore,  the  cross-polarization  ratio  ct„v/<^iih  increases 
with  increasing  angle  and  is  larger  for  crop  canopies  than  for  bare  soil. 
Therefore,  cross  polarization  is  generally  more  suitable  for  -lope 
studies . 

6.2  Variable  Optical  Scenario  Effects 

The  quantity  that  is  directly  measured  in  well-calibrated  electro-optical 
imagery  is  the  in-band  pixel  apparent  radiance.  In  general,  this  "sensed" 
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radiance  exhibits  a  complex  dependence  on  two  major  classes  of  parameters: 

1)  Physical  properties  of  the  terrain  features  being  imaged; 

2)  Sensor,  environmental  and  encounter  parameters  specific  to  the 
imaging  scenario. 

In  the  optical  and  near-IR  region,  the  primary  physical  attribute  of  a 
material  is  its  reflecteince  spectrum.  In-band  reflectance  is 
conventionally  measured  from  remotely-sensed  multi-bemd  imagery  by 
postulating  the  existence  of  a  uniform  illumination  function  across  the 
scene  and  normalizing  it  out  of  the  data.  A  large  body  of  experience  with 
actual  mulit-spectral  imagery  (e.g.,  Landsat)  has  shown  that  measured 
reflectance  spectra  of  terrain  regions  are  subject  to  significant  long¬ 
term  changes.  For  exaitple,  the  reflectance  of  healthy  ground  vegetation 
peaks  strongly  in  the  near-infrared  bands  due  to  relatively  high  moisture 
content,  while  for  stressed  vegetation  this  effect  is  far  less  pronounced. 
At  optical  wavelengths,  the  presence  of  a  surface  moisture  layer  on  opaque 
materials  tends  to  reduce  the  measured  reflectively. 

Sensed  radieince  in  the  thermal  infrared  is  a  fmction  of  material 
properties  (apparent  temperature  and  emissivity)  as  well  as  scenario- 
dependent  parameters  (atmospheric  path  transmission,  path  radiance,  solar 
reflection,  etc.).  Image  ^reprocessing  can  be  used  to  factor  out  the 
effects  of  certain  scenario^ependent  parameters  in  order  to  obtain 
physical  measurements  (i.e.,  tenperature) .  Standard  tools  such  as  the 
L0WTRAN7  computer  code,  a  comprehensive  atmospheric  model,  are  availadile 
for  this  purpose.  Change  analysis  is  complicated  not  by  bulk  variations 
in  temperature  across  a  scene  during  the  normal  course  of  a  day,  but 
rather  by  differential  teii^rature  changes  among  materials  with  variable 
thermal  inertias. 

The  encounter  geometry  also  influences  electro-optical  imagery,  although 
to  a  lesser  extent  than  microwave  radar  images.  Most  natural  surfaces  are 
diffuse  reflectors  and  emitters  of  optical  radiation,  and  can  be 
considered  Lambertain  to  a  first  approximation.  The  reflectivity  and 
emissivity  of  such  materials  do  not  show  a  strong  angular  dependence.  On 
the  other  heuid,  features  like  nan-made  metal  surfaces  and  smooth  bodies  of 
water  cam  appear  nearly  specular  at  optical  and  infrared  wavelengths; 
their  signatures  vary  considerably  with  the  imaging  aspect  and  the  solar 
angle.  Overhead  electro-optical  imagery  used  for  terrain  surface 
characterization  is  often  collected  early  or  late  in  the  day  to  minimize 
specular  reflection  effects. 

Environmentally-induced  variations  in  electro-optical  imagery  result  from 
the  spectral  content  of  the  illumination  and  local  atmospheric  rath 
conditions  (molecular  consituents,  temperature,  moisture  content,  etc.). 
For  example,  direct  solar  illumination  has  different  spectral  prope' ties 
than  indirect  illuminations  due  to  blackbody  emission  from  cl'.'uds. 
Atmospheric  conditions  can  generally  be  predicted  as  a  function  of  local 
meteorological  conditions,  time  of  year  and  earth  latitude  and  longitude. 

The  primary  sensor  parameter  affecting  observed  electro-optical  imagery  is 
the  waveband  of  operation,  since  it  determines  the  basic  phenomena  being 
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measured  (i.e.,  reflection,  emission,  or  both).  Secondary  sensor-induced 
variations  result  from  design  and  sensitivity  differences  among 
instruments,  such  as  scanning  vs.  staring  arrays,  detector  sensitivity  and 
spacing,  system  noise  levels,  and  so  on.  An  importauit  sensor-specific 
effect  that  frequently  inf)edes  electro-optical  image  processing  is  pattern 
noise  resulting  from  detector  responsivity  variation  across  the  focal 
plane  array,  as  well  as  gain  and  offset  imbalances  in  the  A/D  converters 
used  to  digitize  the  focal  plane  readout. 
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7. 


Ccmclusions 


An  overview  of  the  results  obtained  during  the  Phase  I  effort  and 
remaining  problems  appears  in  section  7.1.  Some  recommendations  for  the 
Phase  II  effort  are  discussed  in  section  7.2. 

7.1  Summary  of  Results  and  Remaining  Problems 

The  image-image  registration  efforts  were  initially  encouraging.  The  two 
types  of  rough  registration  methods  were  area-based  and  contour-based. 
These  algorithms  operated  under  the  assumption  of  little  terrain 
distortion  to  the  imagery.  An  contour-based  algorithm  for  dealing  with 
the  problem  of  considerable  terrain  relief  is  outlined.  Fuller 

developnent  and  testing  of  this  procedure  is  a  Phase  II  issue. 

A  two-stage  method  of  combining  the  area-based  and  contour-based 
algorithms  showed  some  improvement  over  the  performance  of  each 
separately.  In  particular,  small  residual  amoxjnts  of  rotation  on  the 
order  of  a  degree  were  removed  by  this  procedure. 

Two  sub-pixel  registration  estimation  algorithms  are  also  presented,  with 
extensive  testing  resu.’ts  so  far  available  for  one  of  them.  The  further 
investigation  of  the  other  algorithm  will  be  landertaken  in  Phase  II. 

The  following  statistics  summarize  registration  accuracies  achieved  in 
testing; 

o  area-based 

-  K-L  algorithm:  r  1  pixel  in  50%  of  cases, 

-  MNF  algorithm:  +  1  pixel  in  75%  of  cases, 

o  contour-based:  <  2  pixels  in  all  cases, 
o  combined  area-contour:  +  1  pixel  (using  MNF  area  method) 
o  sub-pixel  registration: 

-  Algorithm  #1:  ~  5x10“^  pixel  (using  lower  frequencies,  amd  without 

spectral  leakage  filtering) 

-  Algorithm  #2:  <  10"^  pixel  (best  results  using  Hanning  filter  for 

filtering  spectral  leakage)) 

The  sv±)-pixel  results  were  on  simulated  data. 

The  other  main  teclinical  area  investigated  was  change  cueing  on  the  pixel 
level,  given  a  registered  image  pair.  Both  actual  changes  and  simulated 
chcuiges  were  examined.  The  results  of  testing  revealed  excellent  rneiitu 
even  for  smaller  targets  as  long  as  the  local  image-image  correlati-'ii  of 
the  background  was  high,  with  the  performance  degrading  as  this 
correlation  decreases. 

The  remaining  problems/concerns  are: 
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o  Obtaining  a  wider  variety  of  multi-sensor  image  pairs  for  different 
types  of  scenes  and  sensor  parameters, 

o  More  extensive  testing  and  characterization  of  performance  of  present 
methods. 

7.2  Reconanendations  for  Phase  II 


Phase  II  should  continue  the  developnent  and  testing  of  automated  methods 
for: 

o  Area-based  rough  registration. 

o  Contour-based  registration, 

-  present  method  for  imagery  with  little  terrain, 

-  new  method  for  terrain-distorted  imagery. 

o  Sub-pixel  registration  method  #1. 

o  Cheinge  cueing, 

-  usage  of  physically  derived  quantities  in  EO  and  IR  imagery, 

-  frequency  space  euialysis  for  determining  background  correlation, 

-  target  model,  object-based  analysis. 

From  a  signal  processing  standpoint,  this  performance  dependence  on  the 
background  correlation  levels  is  unavaidable  for  pixel  level  procesing. 
However,  considerably  more  progress  can  be  achieved  using  processing 
methods  vdiich  effectively  increase  these  background  corelation  levels  by 
restricting  attention  to  selective  frequency  regions. 

Progress  beyond  \diat  can  be  achieved  using  such  enhanced  pixel-level 
processing  would  probably  require  higher-level  procedures  on  the  object- 
level.  Such  methods  will  be  required  to  make  hypotheses  on  the  existence 
of  objects  based  on  pattern  analysis,  as  opposed  to  simply  thresholding 
based  on  local  statistics.  The  investigation  of  such  object-based  target 
cueing  methods  will  require  the  use  of  target  models. 

For  optical  and  IR  imagery,  it  is  clear  that  the  performance  of  algorithms 
for  sub-pixel  registration  and  change  detection  will  depend  heavily  on  the 
degree  of  correlation  between  images.  Images  based  on  a  direct  observable 
such  as  pixel  apparent  radiance  can  be  significeuntly  decorrealted  over  the 
long-term  cheuiges  in  the  imaging  geometry,  season,  time  of  day,  amovint  of 
solar  reflection,  etc.  However,  it  appears  that  the  effects  of  certain 
variations  can  be  reduced  in  many  cases  by  converting  the  imagery  to  a 
more  fundamental  physical  quantity  prior  to  change  detection. 

In  the  thermal  infrared,  for  example,  the  sensed  radiance  is  a  functi'-n  oi 
actual  megaterial  properties  such  as  apparent  temperature,  and  emissivity, 
as  well  as  scenario-dependent  parameters  such  as  atmospheric  path 
transmission,  path  radiance,  solar  reflected  component,  etc.  The  ultimate 
goal  of  the  pre-processing  is  to  factor  out  the  effects  of  the  scenario- 
dependent  parameters  to  obtain  physical  measurements,  such  as  temperature, 
that  may  exhibit  a  greater  long-term  correspondence  between  well-scheduled 


103 


image  observations.  Available  tools  such  as  LCIMTRAN7  ceui  be  used  for  this 
purpose  during  Phase  II. 

Despite  the  encouraging  results  using  the  automated  methods,  it  is 
strongly  recommended  that  the  Phase  II  workstation  contain  capabilities 
for  interactive  as  well  as  automated  modes  of  operation.  Such  a  dual 
capability  allows  the  use  of  a  hronan  operator  to  examine  and  assess  the 
results  generated  by  automated  registration  procedures,  make  corrections 
if  needed,  euid  to  provide  initial  offsets  for  difficult  or  ambiguous 
cases . 

In  particular,  VE3CCEL  has  developed,  on  ainother  effort,  software  for  an 
electronic  light  table  v^ich  has  all  of  the  capapbilities  of  a  hardcopy 
light  table.  This  electronic  light  table  (LT)is  sensor  independent. 
Together  with  a  sensor  model,  LT  provides  all  of  the  image  nanipulation, 
visual  enhancement,  and  editing  facilites  for  image  registration. 

The  inclusion  of  VEXCEL's  LT  software  package  into  the  Phase  II  prototype 
will  provide  the  needed  interactive  image  registration  cap^J^ilities,  which 
can  then  be  interface  with  the  library  of  automated  techniques. 

For  chemge  detection,  the  use  of  automated  cheuige  cueing  requires  only 
selective  attention  by  the  operator,  but  uses  his/her  superior  judgement 
for  evaluating  cues  as  legitimate  targets.  Therefore,  LT's  visual  image 
enhancement  tools  can  be  used  for  examining  such  cued  potential  targets 
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9-5:  Landsat  TH  Band  s  >  1 . 55x/m-l . ,  Paisin  City,  CA 
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9.2  Polarimetry  Definitions 

The  fcllowing  definitions  are  useful  in  the  analyses  of  polarinietric 
measurements: 

<>  -  ensemble  averaging 

I  i  <■  absolute  value  of  comple.'f  quantity 

-  complex  conjugation 

R( )  •»  real  portion  of  complex  quantity  ( ) 

i()  -  imaginary  portion  of  complex  queuitity  () 

y'  »  polarization  ellipse  orientation  angle, 

X  -  polarization  ellipse  ellipticity  angle,  -ii/4<X<it/4;  X>0 

corresponds  to  left-handed  sense  of  polarization,  X“0  corresponds 
to  linear  polarization,  and  X-n/4  corresponds  to  full  circular 
polariLation 

d  -  fraction  of  signal  power  that  is  fully  polarized,  0<d<l 
{S}  -  Stokes  vector,  a  real  valued  description  of  polarization 
Sg  -  first  Stokes  parameter,  corresponds  to  power 
Sj  ■  second  Stokes  pv^ameter,  -  Sgd  cos(2\i;)cos(2x) 

Sj  -  third  Stokes  paramter,  -  Sgd  sin(24»)cos(2x) 

Sj  -  fourth  Stokes  parameter,  -  S^d  siniZ.j') 

[s]  -  (2x2)  complex  scattering  matrix  relating  transmitted  to  received 
polarization;  elememts  of  matrix  [s]  are  as  follows: 

Sj^i*  complex  coefficient  relating  horizontal  received  conponent 
to  horizontal  transmitted  component 

s,  2-  complex  coefficient  relating  vertical  received  component  to 
horizontal  transmitted  component 

Sji"  conplex  coefficient  relating  horizontal  received  component 
to  vertical  transmitted  component 

Sj2=  complex  coefficient  relating  vertical  received  component  to 
vertical  transmitted  component 
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[F] 


(4x4)  real  Mueller  or  phase  matrix,  relating  ensemble  averaged 
ititi  ov  Stial)  received  and  transmitted  Polarization  as 
described  by  Stokes  vector;  [F]  is  skew  symmetric  if  Is]  is 
symmetric;  elements  of  matrix  [F]  are  as  follows: 


Fi,-  (ISi 


|S,2  I*  +  ISjl  l®2 


R(Sii*Si2  )  +  R(S2  2*S21  ) 


*^14*  1  (  Si  2  *Si  1  )  +  1  (  S2  2  *®2  1  ) 

F21-  (ISiil^  +  IS12I'  -  +  lS22l')/2 

F22*  USiil^  -  I  Si  2!^  -  |S2ll^  +  |S22l*^/2 

F23-  R(Sii*Si2  )  -  R(S22*S2i  ) 

F24-  lCSi2*Sii)  +  1(821*822) 

F31*  F(  Si  1*821)  +  ^(S2  2*^12^ 

Fj  2  R(  Si  1  *82  1  )  -  R(  S2  2  *Si  2  ^ 

Fj  3  -  R(  82  2  *Si  1  )  +  R(  Si  2  *S2  1  ) 

F34-  I(Sn*S21  )  +  I(Si2*S22) 

F41-  1(811*821  )+  I(Si2*S22  ) 

F42-  I(Sii*S2i)  +  l(S22*Si2) 

F43-  1(811*822)  +  I(Si2*S2i) 

F44-  R(  Si  1*82  2)  “^(Si2*S2i) 
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(a)  Impulse  Response 


(b)  Frequency  Response 
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Fig.  9-9;  Cubic  B-Spl: 
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(a)  Impulse  Response 
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9.4  LCWTRAN7  Atmospheric  Model 


LCWTRAN7  is  a  powerful  con^ter  software  package  developed  by  the  Air 
Force  Geophysics  Laboratory  to  compute  the  effects  of  a  wide  range  of 
atmospheric  conditions  within  a  given  transmission  path.  It  can  calculate 
the  atmospheric  transmittance  and  background  radiauice,  and  single  and 
multiple  scattered  solar  and  thermal  radiance  within  a  spectral  resolution 
of  20  cnr^  over  the  wavelength  region  [.2>um,®). 

LOWTRAN  uses  a  single-parcimeter  band  model  for  the  molecular  absorption 
and  includes  molecular  scattering,  aerosol  and  hydrometer  ^Q3sorption  and 
scattering.  If  enough  information  2dx)ut  a  given  image  sensing  scenario  is 
availadsle  (time,  date,  geoemtry,  etc.),  then  it  is  possible  using  LCWTRAN7 
to  conpate  the  atmsopheric  contribution  to  sensed  radieunce  images.  Once 
these  effects  have  been  conpited,  the  images  can  first  be  converted  into 
source  radiance  and  subsequently  to  source  temperature.  The  latter  step 
is  accomplished  by  inverting  the  measured  source  radiance  and  the  imaging 
wavebemd(  s) . 
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